PREFACE


This book gives examples of the uses of elementary statistical methods 
in the design and analysis of experiments carried out in industrial plants 
and scientific laboratories. It also deals with several of the statistical 
features of the problem of establishing a systematic program through 
which the quality of industrial output can be studied and controlled. 
There is a final chapter on some of the statistical aspects of the relation- 
ship of sampling to the risks incurred by producers and buyers. 

Those parts of the chapters in large type are meant to be usable by 
themselves; and they are intended for students, experimenters, and 
production men who are short on mathematical training. The notes 
in smaller type are included for the benefit of those who wish to go a 
little beyond the literary exposition of methods. These notes consist 
of comments on methods, rather detailed derivations, and mere outlines 
or suggestions of derivations. Some of the most important topics have 
been noted in the latter fashion for it is, unfortunately, true that many 
statistical techniques — which have long served industrial statisticians 
well — require rather advanced mathematics for their complete deriva- 
tion. 

The manuscript of this book has for the past several years formed 
the basis of a one-semester course in industrial statistics, Economics 38, 
given at the Massachusetts Institute of Technology. Students in this 
course are not expected to have had previous training in statistics. 

The intermediate steps of many of the examples are omitted, but final 
answers are given. These partially complete examples can be used in 
assignments to students. 

Those who work on industrial problems are aware of the obstacles to 
entirely successful use of statistical methods in industry. In particular, 
the lack of complete equivalence between industrial reality and our 
mathematical models thereof and the many technical complexities of 
manufacture and research make it advisable that our results be taken 
as tentative. Only those who are thoroughly familiar with the indus- 
trial or experimental process at hand can obtain the full benefits of the 
simple statistical methods described in this book and in other works of 
this character. In numerous instances in this book, my knowledge of
the technical processes underlying the data under discussion is slight
and consequently my conclusions may have dubious practical significance.

I am deeply indebted to two former colleagues, Mr. Harold Beilinson, 
now of the War Department, and Mr. L. C. Young, now with Westing- 
house, and to Mr. Churchill Eisenhart, of the University of Wisconsin, 
for many suggestions. Mr. Young and Margaret Z. Freeman have 
kindly carried out many of the computations. I am also indebted to 
our department secretaries, Miss Ethel Downer and Miss Eleanor 
Prescott, for typing the manuscript. Acknowledgments to those who 
have kindly permitted me to use their tables and their data are made 
elsewhere in the book. 

I shall be glad to receive criticism and suggestions from readers. 

H. A. Freeman 


Massachusetts Institute of Technology
Cambridge, Massachusetts
May, 1942



Statistical procedure and experimental design are only two
different aspects of the same whole and that whole is the 
logical requirements of the complete process of adding to 
natural knowledge by experimentation. 

R. A. Fisher 
The Design of Experiments 



CHAPTER I 

THE DIFFERENCE OF TWO MEANS 

1.1 Object of Chapter. We wish to design an experiment so that 
from two samples of data we can answer two questions: (1) are the 
averages in the two larger sources of data, from which the samples were 
drawn, equal or unequal; and (2) if they are unequal, within what 
limits can the value of the inequality be established? It is the purpose 
of this chapter to discuss conditions which should be satisfied by such an 
experiment, to illustrate a method of arranging the experiment so that 
even with a small number of observations the precision of inferences 
will be high, and finally to describe the relevant techniques for analyzing 
the data. 

The design and analysis of experiments involving more than two 
averages will be considered in the following chapters. 

1.2 Examples. Experiments having the objective stated above are 
performed in many branches of science. To give a few examples: in
medicine and biology studies have been made of the effect of a certain 
amount of thymophysin (as compared to none) on blood pressure; also 
the difference in the number of bacterial colonies per plate when counted 
in the afternoon and in the evening. From industry and agriculture we 
have the difference in the effects of indoor and outdoor storage on the 
breaking strength of wood, comparison of the ash content of coals taken
from two mines, and the difference in the yield of a particular variety of
wheat under two types of fertilizer treatment.

1.3 Uses of the results. From such experiments two kinds of information
may be wanted. First, what factor or factors are responsible
for any observed difference in sample averages; and second, as a result 
of the experiment, what action should be taken? We shall consider 
these questions separately. 

The difference in sample averages may be accidental rather than real, 
for samples can have different averages and yet the larger sources of 
data (to be known as 'populations'), from which these samples were
drawn, may have the same averages. If, however, the difference in
sample averages is shown to be real, this difference may always be
attributed separately to the influence of one or more factors and/or
to the joint influence of two or more factors. If the experiment is
designed so that the two samples are unlike only with respect to one 
factor, that factor is held responsible for any real difference in averages 
that may be found. But the samples may be unlike with respect to 
several factors. For example, if it is shown that outdoor storage 
adversely affects the bending strength of wood, the factors responsible 
for the adverse effect may be sunlight and/or rainfall. It is even possible 
that the adverse effect is chiefly due to the joint action of sunlight and 
rainfall, these factors separately having slight influence. This prelimi- 
nary experiment must then be followed by a further set of similar 
experiments in each of which both samples are alike with respect to all 
but one of the suspected factors; in this way, the responsible factor or
factors can finally be identified. 

It is possible, however, to plan the original experiment so that it alone 
will yield all this information; examples will be discussed in detail in the 
second chapter. 

Laboratory, factory, and field experiments do not by themselves 
provide sufficient information to determine economic policy. For 
example, an experiment shows that the bending strength of wood is 
impaired by outdoor storage. Users of wood may, however, be partly 
interested in another quality characteristic, such as hardness, which by 
a similar experiment can be shown to be unaffected by outdoor storage. 
Users will, therefore, be willing to pay only a fraction of the premium 
arising from the additional cost of indoor storage and the determination 
of that fraction clearly depends on facts not supplied by either experi- 
ment. If between two methods of manufacture or two types of product 
no real (statistical) difference is found, users will have no definite
preference and producers will favor the method or product involving the
lesser cost. When a real difference is found, and if that difference is
practically significant, the resultant shift in the preference of users, as
well as cost differences, will determine the effect on the market of the
results of the experiment.

1.4 Problems facing the experimenter. We shall now consider a
specific example, but the reader should be able to apply the discussion 
to any experiment involving a difference of two averages. 

An experimenter wishes to determine whether or not the average 
amounts of corrosion of two types of wrought ferrous pipe coating are 
the same. The two types are open-hearth iron and puddled iron. He 
selects several specimens of each coating, buries them in the soil and, 
on later removing them, measures the corrosion of each specimen. If 
the amount of corrosion of open-hearth coating is designated by
the variable $X$ and that of puddled-iron coating by the variable $Y$,
his data on $p$ specimens of the former and $q$ specimens of the latter
are as follows:

$$\begin{array}{cc} X_1 & Y_1 \\ X_2 & Y_2 \\ \vdots & \vdots \\ X_p & Y_q \end{array}$$

In carrying out this experiment, he will have had to make several 
important decisions, and on them depends much of the reliability of his 
later inferences. First, should factors that can be held constant be 
allowed to vary? For example, should all specimens of pipe coating be 
of the same size, be buried in the same type of soil, at the same depth, 
covered with the same backfill, and be left in the soil the same period 
of time? Second, what account can be taken of uncontrollable factors, 
such as the weather after the burial of the specimens? Third, how are 
test specimens to be selected from the larger sources of supply, how 
many should be taken, and should there be a like or unlike number of 
each type of coating? Fourth, what index of corrosion shall be used? 
This is only a partial list but it covers the types of questions that must be 
answered in any experiment of this kind. 

1.5 Desirability of control. Let us first compute the arithmetic
means $\bar{X}$ and $\bar{Y}$ of the two samples. These averages are given by

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_p}{p}$$

$$\bar{Y} = \frac{Y_1 + Y_2 + \cdots + Y_q}{q}$$

and they can be regarded as estimates of the respective population
means $\bar{X}'$ and $\bar{Y}'$. We will consider the magnitude of $\bar{X} - \bar{Y}$ in the
light of the tentative hypothesis that the population means $\bar{X}'$ and $\bar{Y}'$
are equal. If $\bar{X} - \bar{Y}$ is not zero and if we can show that its departure
from zero was not accidental, the hypothesis $\bar{X}' - \bar{Y}' = 0$ will
be rejected.

Confidence that $\bar{X} - \bar{Y}$ is an accurate measure of $\bar{X}' - \bar{Y}'$ is in part
dependent on the variability among observations in the populations
from which the two samples are drawn. Assume that a sample can be
drawn so that the unknown variability of the variates in the population
is proportional to the calculated variability of the variates in the sample.
If, then, the sample variates $X_1, X_2, \ldots, X_p$ differ slightly from each
other, $\bar{X}$ will be a relatively reliable measure of the population mean $\bar{X}'$
in the sense that further sampling from the population would not greatly
affect $\bar{X}$. If $X_1, X_2, \ldots, X_p$ vary considerably, $\bar{X}$ is a less reliable
estimate of $\bar{X}'$. This would be the case if, for example, open-hearth
coatings were buried in several types of soils which differ in their
corrosiveness or if some specimens were removed from the soil before
others. The argument is similar for $\bar{Y}$.

1.6 Pairing. In the example given, variability among $X_1, X_2, \ldots, X_p$
and among $Y_1, Y_2, \ldots, Y_q$ can be reduced to a minimum by using
specimens all of the same size, burying them in the same soil and for the
same length of time, etc. The precision of inference will be thereby
improved, but the great disadvantage of this type of experiment lies
in its reduced generality and in the practical difficulty of performing
such an experiment at all. Complete control over all relevant factors
(pipe size, kind of soil, and burial period) is practically impossible
to achieve in ordinary experimenting.

The arrangement shown in the following table attains both objectives,
practicality and precision. First, it makes possible the introduction
into the experiment of variability in such important factors as type of
soil and period of burial, thus making the experiment practically feasible
and allowing it to simulate conditions of industrial life; and second, it
excludes the influence of the variability of these factors on the precision
of inferences relating to the arithmetic means.



                                                  Corrosion

Kind of soil,         Open-hearth       Puddled-          Difference
length of burial      iron coatings     iron coatings

Clay, A years         $X_1$             $Y_1$             $d_1 = X_1 - Y_1$
Cinders, B years      $X_2$             $Y_2$             $d_2 = X_2 - Y_2$
...                   ...               ...               ...
Loam, C years         $X_n$             $Y_n$             $d_n = X_n - Y_n$

Mean                  $\bar{X}$         $\bar{Y}$         $\bar{d} = \bar{X} - \bar{Y}$


Each of the quantities $d_1, d_2, \ldots, d_n$ is unaffected by differences
among the various soils and the various lengths of burial, for in each
pairing both kinds of pipe are treated alike with respect to these factors.
Hence the error of $\bar{d}$ tends to be small. At the same time, the experiment
manages to include the various soil types and lengths of burial
occurring in the ordinary industrial use of these coatings. Even the
uncontrollable factor, variable weather, is also introduced and its effect
likewise excluded, for the two specimens in any pairing will likely be
buried side by side. One must, however, recognize the fact that the
results of this experiment are not as reliable for a particular combination
of the influential factors (say cinders, B years' burial, and heavy rain
after burial) as an experiment in which all $2n$ observations were
devoted to that combination.

In a single experiment a very wide range in the nature of these
combinations of influential factors is disadvantageous. Thus, in muck soil,
the superiority of open-hearth coverings may be much greater than in
any other soil, that is, the value of $d$ will be relatively large in that
pairing. The increased freedom allowed the experimenter by the
inclusion of this kind of soil may be offset by the increased variability
of the variates $d_i$. If this unusual reaction to muck soil is already
familiar to the experimenter, then muck soil should not be included in
the present experiment, for its inclusion is uninformative and the loss in
precision is costly.

It is important to note that the estimate, from the paired results, of
the true error of the mean difference $\bar{d}$ is based on the $n$ variates $d_i$,
whereas in the case in which $\bar{d}$ was formed from unpaired observations,
the estimate of error is based on the $2n$ variates $X_i$ and $Y_i$. In each
example we shall have to determine whether the increased precision of $\bar{d}$
resulting from the reduction of variability due to pairing is or is not
offset by the loss of precision due to a 50 per cent reduction in the
number of variates.
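
The trade-off just described can be explored with a small simulation. The sketch below is a modern illustration in Python and not part of the original procedure; every number in it (the means 50 and 45, the spread of 10 between burial sites, the spread of 3 within a site) is hypothetical and chosen only to make the effect visible.

```python
import random
import statistics

def simulate(n_pairs=15, seed=1):
    """Simulate corrosion for pairs of specimens sharing one burial site.

    All numerical settings are hypothetical, for illustration only.
    """
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n_pairs):
        site = rng.gauss(0, 10)                 # soil/burial effect shared by the pair
        x.append(50 + site + rng.gauss(0, 3))   # first coating
        y.append(45 + site + rng.gauss(0, 3))   # second coating
    return x, y

x, y = simulate()
n = len(x)
d = [xi - yi for xi, yi in zip(x, y)]

# Paired analysis: error of the mean difference estimated from the n differences.
se_paired = (statistics.variance(d) / n) ** 0.5

# Unpaired analysis: error estimated from the 2n separate observations.
se_unpaired = ((statistics.variance(x) + statistics.variance(y)) / n) ** 0.5

print(f"paired   (n differences):   standard error = {se_paired:.2f}")
print(f"unpaired (2n observations): standard error = {se_unpaired:.2f}")
```

When the shared site effect is large relative to the within-site variation, as assumed here, the paired error is much the smaller of the two. Setting the site spread near zero in `simulate` removes that advantage, leaving only the loss from halving the number of variates.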

1.7 Randomization. The two objectives, precision and practicality,
are achieved by pairing. The remaining objective is to avoid bias, and
this can be achieved by randomization.

It may happen that certain influential factors cannot be handled by 
pairing. In such a case, the influences of these factors cannot be
eliminated, but they can be distributed so that our comparison of $\bar{X}$ and $\bar{Y}$
is not vitiated by their presence. To illustrate the point, assume that
in the present experiment the orientation of specimens in the soil might 
influence the amount of their corrosion. If, then, all open-hearth 
specimens are buried in the east side of each excavation and all puddled- 
iron specimens in the west side, any conclusion that, say, open-hearth 
coatings are better than puddled-iron coatings is now assailable on the
ground that the east side may have been a favorable position. This
possibility can be precluded simply by assigning positions to the
specimens of each pairing in random fashion, for example, by tossing a coin.
Randomization provides a completely objective technique of removing
the possible systematic effects of uncontrolled factors, effects which if not 
randomized might vitiate the comparison of the means. 
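
The coin-tossing rule can be sketched in a few lines of Python. This is an illustration only: the function name `assign_positions` and the seed are ours, and a pseudo-random generator stands in for the physical coin.

```python
import random

def assign_positions(n_pairs, seed=None):
    """For each pairing, toss a coin to decide which coating goes on the east side."""
    rng = random.Random(seed)
    layout = []
    for pair in range(1, n_pairs + 1):
        if rng.random() < 0.5:   # "heads"
            east, west = "open-hearth", "puddled-iron"
        else:                    # "tails"
            east, west = "puddled-iron", "open-hearth"
        layout.append({"pair": pair, "east": east, "west": west})
    return layout

layout = assign_positions(15, seed=1942)
for row in layout:
    print(row)
```

Every excavation still holds one specimen of each coating; only the side is left to chance, so a favorable east side cannot systematically benefit one coating.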

Size of pipe, i.e., exposed area, somewhat affects corrosion as measured
by depth of pits. This factor has not been included in the pairing
arrangement, for the resultant need of drawing a fixed number of each
size of pipe from the population would interfere with the simple sampling
technique to be discussed in the next section. This factor should
therefore be randomized. The same argument applies to any factor. If
length of burial is not included among the controls in the pairing
arrangement, random selection of burial periods will preclude the possibility
that this factor will vitiate the results, something which might
happen if longer burial periods were unfortunately associated with
one kind of coating. It is disadvantageous to randomize a factor
which could be controlled by pairing, for the effect is to increase
the variability among the $d_i$ and therefore the error against which $\bar{d}$
is judged.

1.8 Selection of specimens. The method of selecting specimens of 
each type of coating must be one which will not vitiate the experi- 
menter's conclusions. For example, if, as a result of biased sampling, 
the open-hearth coatings used in the experiment are better on the 
average than those in their population while puddled-iron specimens 
are, on the average, poorer than those in their population, any
inference of the nature of $\bar{d}'\ (= \bar{X}' - \bar{Y}')$ from the observed data will be
vitiated. As a second example, if as a result of biased sampling, the
open-hearth specimens in the sample are more uniform in amounts of
corrosion than the specimens of their population, an incorrectly high
precision will be placed on $\bar{X}$.

Such bias can be avoided by selection of the specimens for each sample 
in such a way that all specimens of the corresponding population have an 
equal opportunity of being drawn. Such random selection may be 
carried out in the following way: Assume there are 30,000 open-hearth 
specimens in the population and 40 are to be drawn. Assign numbers 1 
to 30,000 to the specimens in the population. From any page of a 
table of random numbers (numbers composed of randomly selected 
digits) write down in order five-place numbers (omitting all numbers 
over 30,000) until the numbers of 40 specimens have been drawn. 
Similarly for puddled-iron specimens. Among such tables is one by 
Fisher and Yates (16) in which the digits were obtained from the 15th to 
the 19th digits of a set of 20-place logarithms. The direct approach 
would be to draw at random from a well-mixed bowl of 30,000 chips 
marked from 1 to 30,000, but the labor of marking is great. 
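
A present-day equivalent of the table of random numbers is a pseudo-random generator. The following Python sketch draws 40 specimen numbers from 1 to 30,000; the seed is arbitrary and is fixed only so that the draw can be repeated. Note one assumption: `sample` never returns the same number twice, which corresponds to passing over any repeated number met in the table.

```python
import random

POPULATION_SIZE = 30_000   # specimens numbered 1 to 30,000, as in the text
SAMPLE_SIZE = 40

rng = random.Random(16)    # arbitrary seed, used only for reproducibility
chosen = sorted(rng.sample(range(1, POPULATION_SIZE + 1), SAMPLE_SIZE))
print(chosen)
```

The same call, with the appropriate population size, would serve for the puddled-iron specimens.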

The purpose of random selection is clear but random selection in
practice may be difficult. For example, in dealing with fibers it is 
impossible to assign a number to each specimen in the population; 
furthermore precaution will have to be taken to avoid the tendency to 
draw the longer fibers. Chance may select a specimen from the center 
of a rug, or the specimen of pipe whose number is drawn may be at the 
bottom of a pile of thousands of specimens. These difficulties necessi- 
tate compromises but every effort should be made to remove subjective 
decision and its attendant biases from the method of selection. 

1.9 Size of the experiment. The number of specimens to be used
in an experiment is related to (a) the expected value of the mean
difference, (b) the variability of the variates in the population, and (c) the
confidence with which our conclusions are to be stated. If in two
experiments factors (a) and (b) are the same, then the greater the
desired degree of confidence, the larger must be the size of the samples.
If (a) and (c) are the same, the greater the variability, the larger the
size of the samples. If (b) and (c) are the same, then the greater the
expected value of the mean difference, the smaller the size of the samples.

Another important influence on the size of the experiment results 
from the fact that the variability of the variates in the population, 
whether large or small, must be estimated from the samples. This 
estimate is subject to error, and this error is reduced by use of larger 
samples. 

These general considerations do not enable an experimenter to decide
whether he will need 10 or 50 specimens. Full information on
factors (a) and (b) may be available only when the experiment is
completed. If advance estimates of the magnitudes of (a) and (b) can be
made, the proper value of the size of each sample, $n$, can be
approximated by formulae to be developed presently.

1.10 Quality characteristics. Users of an industrial product are 
often interested in more than one of its qualities. For example, both 
hardness and tensile strength may be important. Two possibilities 
are open to the experimenter: (a) he may conduct separate experiments
for each quality characteristic or (b), if not more than one quality
characteristic necessitates a destructive test, he can obtain data on all
characteristics from one experiment. In the case of hardness and
tensile strength (b) would apply, for the test for hardness is not
destructive. 

For any one quality characteristic several measures may be available. 
For example, corrosion may be measured by loss of weight or by depth 
of maximum pits. The experimenter should generally choose a measure 
which varies continuously in preference to one which can assume only a
few values. An experiment on corrosion using loss of weight or depth
of pits (both of which vary continuously upwards from zero) yields 
more information than a like-sized experiment in which amount of 
corrosion is measured simply as high, medium, and low. Such a crude 
classification conceals information which a continuous measure reveals. 
We shall not discuss methods appropriate to this crude type of classifi- 
cation, although for a few industrial products it may be the only type 
available. 

1.11 An experiment in detail. Thirty specimens, fifteen of each 
type of coating, are drawn at random from their respective populations. 
One specimen of each type of coating is included in each pair; each 
pair is buried in the same soil, in similar positions, at the same depth and 
for the same period of time. The various pipe sizes, ranging from 
1 inch to 1 % inches, are randomized. The results follow : 


        Controls                         Depth of maximum pits
                                  (expressed in thousandths of an inch)

Kind of soil    Length of         Open-hearth      Puddled-         Difference
                burial (years)    iron coatings    iron coatings

Clay                 4.5               73               51             +22
Clay                 3.8               43               41             + 2
Cinders              7.1               47               43             + 4
Cinders              6.1               53               41             +12
Peat                 2.0               58               47             +11
Tidal marsh          4.4               47               32             +15
Loam                 6.5               52               24             +28
Clay                 9.2               38               43             - 5
Clay                 8.5               61               53             + 8
Clay                 8.0               56               52             + 4
Loam                 5.7               56               57             - 1
Clay                 3.2               34               44             -10
Clay                 4.2               55               57             - 2
Loam                 6.6               65               40             +25
Alkali knoll         6.4               75               68             + 7
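
The differences in the last column, and their mean, can be recomputed directly from the two columns of pit depths. The short Python sketch below is a modern check on the arithmetic and not part of the original text; the variable names are ours.

```python
# Depth of maximum pits (thousandths of an inch), in the order of the table.
open_hearth = [73, 43, 47, 53, 58, 47, 52, 38, 61, 56, 56, 34, 55, 65, 75]
puddled_iron = [51, 41, 43, 41, 47, 32, 24, 43, 53, 52, 57, 44, 57, 40, 68]

d = [x - y for x, y in zip(open_hearth, puddled_iron)]   # paired differences
n = len(d)
d_bar = sum(d) / n                                        # mean difference

print(d)
print(f"n = {n}, sum of differences = {sum(d)}, mean difference = {d_bar}")
```

The fifteen differences sum to 120, so the mean difference is 8 thousandths of an inch.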


1.12 General nature of the test of the hypothesis $\bar{d}' = 0$. The test
of the hypothesis $\bar{d}' = 0$ proceeds as follows: First, considerable
information regarding the distribution of $d_i$ in the population is assumed to
be at hand. Now assume that from this population of $d_i$ a very large
number $k$ of random samples each of $n$ specimens have been drawn and
the mean of each sample computed. It will be found that means which
depart considerably from the population mean ($\bar{d}' = 0$) occur
infrequently whereas means near $\bar{d}' = 0$ occur frequently. The frequency
$g$ and the fractional frequency or probability $g/k$ of samples whose
means depart from the population mean by a given amount is thus
experimentally determinable. We may then determine by actual count
the probability $P$ of a departure from the population mean as large or
larger than that actually observed. As a matter of fact, the distribution
of sample means is mathematically determinable, so there is no need
for the laborious experimental approach to the determination of $P$. If $P$
is large, the observed difference $|\bar{d} - \bar{d}'|$ is attributed to the vagaries
of sampling; if $P$ is small, the difference $|\bar{d} - \bar{d}'|$ is taken to be real
and the hypothesis $\bar{d}' = 0$ is rejected; the two materials under
investigation are said to be significantly different in their means.
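
The "laborious experimental approach" is no longer laborious if a computer draws the samples. The Python sketch below counts, among $k$ simulated samples from a normal population of mean zero, the fraction $g/k$ whose means depart from zero by at least the observed amount. It rests on assumptions that should be stated plainly: the observed mean difference of 8 is that of the fifteen pairs in section 1.11, while the population standard deviation is simply set to the square root of 121.571 for illustration, as though it were known.

```python
import random

def empirical_p(observed, n, sigma, k=20_000, seed=0):
    """Fraction g/k of sample means (population mean 0) at least as far
    from 0 as the observed mean."""
    rng = random.Random(seed)
    g = 0
    for _ in range(k):
        mean = sum(rng.gauss(0.0, sigma) for _ in range(n)) / n
        if abs(mean) >= abs(observed):
            g += 1
    return g / k

# sigma is an assumed value, used here only to make the sketch concrete.
p = empirical_p(observed=8.0, n=15, sigma=121.571 ** 0.5)
print(f"empirical P = {p:.4f}")
```

Under these assumptions the count gives a small $P$, well under 5 per cent, so a departure as large as the one observed would rarely arise from sampling alone.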

1.13 Normality of the population. One of the facts assumed to be
known of the population is that the frequencies $f$ of values of $d$ in the
population are normally distributed; that is,

$$f = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-\frac{(d - \bar{d}')^2}{2\sigma^2}} \qquad [1]$$

where $N$ is the total frequency (total number of observations) in the
population ($N$ can be assumed to be infinite) and $\sigma$ is the standard
deviation of $d$, the nature of which will be discussed presently. Just as
the equation $y = ax$ represents a straight line with slope depending on
the value of the parameter $a$, so [1] represents a normal distribution of
frequencies with exact shape and position depending on values of the
parameters $\bar{d}'$ and $\sigma$.

Three simple properties of [1] may be noted here. The squared
exponent of $e$ shows that the frequencies of $+(d - \bar{d}')$ and $-(d - \bar{d}')$
are equal for any $d$, that is, the distribution is symmetrical around
$d = \bar{d}'$. The maximum frequency occurs when the exponent of $e$ is
zero, which is at $d = \bar{d}'$. Finally $f$ approaches zero as $\pm(d - \bar{d}')$
becomes large. If the probability $f/N$ is plotted against the deviation
$d - \bar{d}'$, we have the following curve, for fixed values of the parameters
$\bar{d}'$ and $\sigma$.

[Figure: normal curve of $f/N$ plotted against $d - \bar{d}'$]
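
The three properties can be checked numerically. The Python sketch below assumes that [1] is the usual normal frequency function, $f = N\,e^{-(d-\bar{d}')^2/2\sigma^2}/(\sigma\sqrt{2\pi})$; the parameter values 0 and 11 are illustrative only and are not taken from the text.

```python
import math

def normal_frequency(d, mean, sigma, total=1.0):
    """Frequency f of the value d under the normal law; `total` plays the role of N."""
    z = (d - mean) / sigma
    return total / (sigma * math.sqrt(2 * math.pi)) * math.exp(-0.5 * z * z)

mean, sigma = 0.0, 11.0   # illustrative parameter values
for dev in (0, 5, 10, 20, 40):
    left = normal_frequency(mean - dev, mean, sigma)
    right = normal_frequency(mean + dev, mean, sigma)
    print(f"deviation {dev:>2}: f/N = {right:.6f} (mirror value {left:.6f})")
```

The printed pairs agree on either side of the mean, the largest value falls at zero deviation, and the values shrink toward zero as the deviation grows.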

The technique of testing the hypothesis that the population is normal
is similar in general nature to the test of the hypothesis $\bar{d}' = 0$ and to
practically all other tests that will be made in this book. From the
data of the sample we compute one or more constants whose values are
known for a perfectly normal population. Then allowance is made for
the fact that even from a perfectly normal population a random sample
having non-normal characteristics may by chance be drawn,
particularly if the size of the sample is small. If, for the particular value of $n$
used, the departures of the sample constants from the known normal
values are greater than can be so allowed by chance, the hypothesis that
the sample came from a normal population is rejected.

Two such constants* are

$$\sqrt{\beta_1} = \frac{\sum (d - \bar{d}')^3 / N}{\left[\sum (d - \bar{d}')^2 / N\right]^{3/2}} = \frac{\text{Third moment of the population about its mean}}{(\text{Second moment of the population about its mean})^{3/2}}$$

and

$$\alpha = \frac{\sum |d - \bar{d}'| / N}{\left[\sum (d - \bar{d}')^2 / N\right]^{1/2}} = \frac{\text{Mean deviation of the population}}{(\text{Second moment of the population about its mean})^{1/2}}$$

For a normal distribution $\sqrt{\beta_1} = 0$ and $\alpha = \sqrt{2/\pi}$. The former is
obvious because a normal distribution is symmetrical around its mean;
hence any odd moment about the mean will be zero. Assume that
from a normal population a very large number of samples, each of size $n$,
are drawn and for each sample two constants $\sqrt{b_1}$ and $a$ are computed,
where

$$\sqrt{b_1} = \frac{\text{Third moment of the sample about its mean}^{\dagger}}{(\text{Second moment of the sample about its mean})^{3/2}} = \frac{\sum (d - \bar{d})^3 / n}{\left[\sum (d - \bar{d})^2 / n\right]^{3/2}}$$

and

$$a = \frac{\text{Mean deviation of the sample}}{(\text{Second moment of the sample about its mean})^{1/2}} = \frac{\sum |d - \bar{d}| / n}{\left[\sum (d - \bar{d})^2 / n\right]^{1/2}}$$

If the resulting distribution of the frequencies of values of $\sqrt{b_1}$ is
examined, it will be found that values near zero occur most frequently.
The form of this distribution has been successfully approximated and
Table I shows values of $\sqrt{b_1}$, for given $n$, beyond which 5 per cent and
1 per cent of all values of $\sqrt{b_1}$ of random samples from a normal
population are found. Similar information on $a$ is shown in Table III.

* These parameters should be defined here and elsewhere in this chapter in terms
of integrals but the summations used here should cause no difficulty.

$\dagger$ Properly, third moment of the elements of the sample about their mean, etc.

It will be noted that 5 per cent and 1 per cent of the frequencies may
be interpreted as 5 per cent and 1 per cent of the area under the
frequency curve. Thus, for $n = 40$ the graph of $\sqrt{b_1}$ illustrates the
situation.

[Figure: probability curve of $\sqrt{b_1}$ for $n = 40$]

Similar arguments hold for $a$, the distribution of which is not
symmetrical about $\sqrt{2/\pi}$ (= 0.798). Thus for $n = 41$, the graph of $a$
illustrates the situation.

[Figure: probability curve of $a$ for $n = 41$]



In the present example on the corrosion of pipe coatings we find

$$\sqrt{b_1} = 0.330$$
$$a = 0.814$$

If fewer than say 1 per cent or 2 per cent of random samples of size
$n = 15$ yield values departing by as much as or more than 0.330 and
0.016 from the expected values 0 and 0.798, the sample at hand cannot
be considered to have been drawn from a normal population. From
Table I, 1 per cent of all random samples have values of $\sqrt{b_1}$ exceeding
1.061 (for sample size 25, the first entry in the table). Now, the spread
of the distribution of the $\sqrt{b_1}$ is less for large than for small $n$. Hence,
more than 1 per cent of all random samples of size 15 have $\sqrt{b_1} > 0.330$,
and the hypothesis of normality is not refuted.

From Table III, the 1 per cent levels of $a$ are approximately 0.92
and 0.68. Our value $a = 0.814$ is within this range. Hence the
hypothesis of normality is not refuted by this second (and independent) test.

For small samples these tests are sensitive only to large departures
from normality. The diagram shown below of the sample data
appears somewhat non-normal in skewness but the test, which is
a test of skewness, did not offer support.

[Figure: diagram of the sample data]
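
The two sample constants quoted for the corrosion data can be reproduced from the fifteen differences of section 1.11. The Python sketch below is a modern check, not the author's worksheet; it takes the second and third moments and the mean deviation about the sample mean, each with divisor $n$.

```python
# Paired differences from the table of section 1.11.
d = [22, 2, 4, 12, 11, 15, 28, -5, 8, 4, -1, -10, -2, 25, 7]
n = len(d)
d_bar = sum(d) / n

m2 = sum((v - d_bar) ** 2 for v in d) / n        # second moment about the mean
m3 = sum((v - d_bar) ** 3 for v in d) / n        # third moment about the mean
mean_dev = sum(abs(v - d_bar) for v in d) / n    # mean deviation

sqrt_b1 = m3 / m2 ** 1.5    # skewness constant
a = mean_dev / m2 ** 0.5    # ratio of mean deviation to standard deviation

print(f"sqrt(b1) = {sqrt_b1:.3f}")
print(f"a        = {a:.3f}")
```

The results agree with the values 0.330 and 0.814 given in the text.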

1.14 Variance of the population. It has been noted that the
reliability of $\bar{d}$ depends in part on the variability in corrosion of the
specimens in the population, so a knowledge of the amount of this variability
is necessary. One measure which would seem reasonable is

$$\frac{\sum (d - \bar{d}')}{N}$$

but this is not useful, for its value is always zero.

$$\sum (d - \bar{d}') = \sum d - N\bar{d}' = N\bar{d}' - N\bar{d}' = 0$$

A second possibility is the average of the absolute values
of deviations of observations about their mean, i.e., the mean deviation

$$\frac{\sum |d - \bar{d}'|}{N}$$

This is algebraically an inconvenient measure and it does not fit well
into the general body of statistical theory. The best measure of
variability is the standard deviation $\sigma$, which has already been
introduced as a parameter of the normal distribution.

$$\sigma = \sqrt{\frac{\sum (d - \bar{d}')^2}{N}}$$

We shall work with the variance $\sigma^2$. The true value of the population
variance $\sigma^2$ is unknown, but it can be estimated from the sample data.
If the sample at hand is large, a good estimate ($\hat{\sigma}^2$) of $\sigma^2$ is the sample
variance $s^2$, which is given by

$$s^2 = \frac{\sum (d - \bar{d})^2}{n} \qquad [2]$$

If the sample is small, we shall later show in part that the appropriate
estimate $\hat{\sigma}^2$ is given by

$$\hat{\sigma}^2 = \frac{\sum (d - \bar{d})^2}{n - 1} \qquad [3]$$

the divisor representing the number of independent values of $d$ (degrees
of freedom). Thus from $n$ values of $d$ one constant $\bar{d}$ has been calculated;
hence, given this value of $\bar{d}$, only $n - 1$ values of $d$ are unfixed, or
independent. [3] is always better than [2] but for samples with $n > 30$ the
difference may be neglected. We write

$$\sum (d - \bar{d})^2 = \sum d^2 - 2\sum d\,\bar{d} + \sum \bar{d}^2$$

the last two terms of the above can be combined and we have

$$\sum (d - \bar{d})^2 = \sum d^2 - n\bar{d}^2 = \sum d^2 - \frac{(\sum d)^2}{n}$$

where $n$ is the number of observations. The last form is the most
convenient for the purposes of calculation.

In the present example

$$\hat{\sigma}^2 = 121.571$$
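
The figure 121.571 follows from the convenient form of the sum of squares applied to the fifteen differences of section 1.11. The Python sketch below is a modern check on that arithmetic; the variable names are ours.

```python
# Paired differences from the table of section 1.11.
d = [22, 2, 4, 12, 11, 15, 28, -5, 8, 4, -1, -10, -2, 25, 7]
n = len(d)

sum_d = sum(d)                       # sum of the differences
sum_d2 = sum(v * v for v in d)       # sum of their squares
ss = sum_d2 - sum_d ** 2 / n         # sum of squared deviations, shortcut form

s2 = ss / n                  # sample variance, divisor n       (formula [2])
sigma2_hat = ss / (n - 1)    # small-sample estimate, n - 1     (formula [3])

print(f"sum d = {sum_d}, sum d^2 = {sum_d2}, sum of squared deviations = {ss}")
print(f"[2] gives {s2:.3f}; [3] gives {sigma2_hat:.3f}")
```

With only fifteen observations the two divisors give noticeably different answers (about 113.5 against 121.6), which is why [3] is preferred for small samples.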

1.15 The u test. The appropriate tests of the difference of two
means may now be described in greater detail. Assume that we are
given a normal population of known variance $\sigma^2$ and mean $\bar{d}'$; what is
the distribution of the means of random samples each of $n$ observations?

First, consider a population defined only by $\bar{d}' = 0$. If from such a
population a very large number $k$ of samples each of size $n$ are drawn,
and the fractional frequencies or probabilities $g_i/k$ of their means $\bar{d}_i$ are
calculated, a distribution of the frequency of the various means can be
plotted. It will be apparent to the reader that (1) the maximum
frequency of this distribution occurs at $\bar{d} = \bar{d}'\ (= 0)$, (2) the distribution
of sample means has smaller variance than the parent population,
(3) the variance of the distribution of means is smaller for large than
for small $n$; and (4) if the population is symmetrical about $\bar{d}'$ (as is the
normal distribution) the distribution of sample means will be
symmetrical about $\bar{d} = \bar{d}'$.

In support of (2) and (3) it will later be proved that if the variance
of the population is $\sigma^2$, the variance of the distribution of means is $\sigma^2/n$.
In connection with (4) it will be shown that if the population is normal
the distribution of the sample means will also be normal.

If the fractional frequency or probability of random samples with means
greater than the mean of our sample $\bar{d}$ (and less than $-\bar{d}$) is low, say less
than 5 per cent, the hypothesis that our sample was a random sample from
a normal population of mean $d' = 0$ and variance $\sigma^2$ is hardly tenable.
If this is found and if (1) our sample is random and (2) the variance of
the population is $\sigma^2$ and (3) the population is normal, it follows that
$d' \neq 0$. The means of the two materials are significantly different.

This test of differences of means, which will be called the u test, is thus
based on the following theorem: Given a normal population of mean $d'$
and variance $\sigma^2$, the means of random samples each of n observations
will be distributed normally with mean $d'$ and variance $\sigma^2/n$. This may
also be expressed as follows: given a normal population of mean $d'$ and
variance $\sigma^2$, the statistic

$$u = \frac{\bar{d} - d'}{\sigma/\sqrt{n}}$$

is normally distributed with mean zero and variance unity.

1.16 The t test. In our case the variance of the normal population
is unknown; it must be estimated from a small sample, and the u test
must be modified. The statistic

$$t = \frac{\bar{d} - d'}{\hat{\sigma}/\sqrt{n}}$$

will be distributed symmetrically with mean zero, but the distribution
will be somewhat more peaked and will, in general, have a wider range
than the normal distribution, with its shape depending on the number
of independent observations (called degrees of freedom) from which
the estimate $\hat{\sigma}^2$ is calculated. The test of significance is called the t test,
and the peaked distribution is known as "Student's" distribution.

In our example the population of differences is normal; also $d' = 0$,
by hypothesis, and $\hat{\sigma}^2 = 121.571$; the variance $\hat{\sigma}_{\bar{d}}^2$ of the distribution of
means is $\hat{\sigma}^2/n = 8.105$ and the standard deviation (standard error) of
the distribution of means is $\sqrt{8.105} = 2.847$. The difference $\bar{d} - d'$
is 0.008 inch. We express this difference in terms of a measure of its
error such as $\hat{\sigma}_{\bar{d}}$; this division of one linear function ($\bar{d} - d'$) by another,
$\hat{\sigma}_{\bar{d}}$, eliminates the effect of the units of the difference and
permits the use of a single set of tables for all problems. The difference
of 0.008 inch is 8/2.847 or 2.81 standard error units.

The deviation is 2.81 standard error units, and the estimate of the 
variance is based on 14 degrees of freedom. From Table V the proba- 
bility of exceeding by chance a deviation of 2.81 standard error units is 
only 0.015, approximately. Thus the two kinds of pipe are significantly 
different in their rates of corrosion. If the two types differ only 
in one characteristic, say method of manufacture or inclusion or 
exclusion of slag, this factor can be held responsible for the difference 
in quality. 
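The arithmetic of the paired test can be retraced from the summary figures quoted above. This is a sketch only: the variable names are assumptions, and the difference is entered as 8 (thousandths of an inch), which is inferred from the division 8/2.847 in the text.

```python
import math

n = 15                 # number of paired differences
sigma2_hat = 121.571   # estimated variance of the differences (from the text)
d_bar = 8.0            # mean difference, assumed to be in thousandths of an inch
d_prime = 0.0          # hypothetical population mean

variance_of_mean = sigma2_hat / n            # variance of the distribution of means
standard_error = math.sqrt(variance_of_mean)
t = (d_bar - d_prime) / standard_error
degrees_of_freedom = n - 1

print(round(variance_of_mean, 3), round(standard_error, 3))
print(round(t, 2), degrees_of_freedom)
```

The probability attached to this value of t still has to be read from a table of Student's distribution (Table V in the text) at 14 degrees of freedom.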

1.17 Analysis of unpaired variates. If kind of soil and length of
burial do not affect corrosion, it is disadvantageous to consider specimens
to be paired with respect to these factors, for the variates $d_i$ will be no
less variable than $X_i$ and $Y_i$, and there are only n variates $d_i$ in place of
2n variates $X_i$ and $Y_i$. If kind of soil and length of burial affect corrosion,
pairing will likely be advantageous. To determine the gain or loss
resulting from pairing, the variates, considered as unpaired, must be
analyzed.

Given a normal population of mean $X' - Y' = 0$ and of variance $\sigma^2$;
assume that from this population two random samples are drawn, of
size $n_X$ and $n_Y$, and that the difference of their means is $\bar{X} - \bar{Y}$. If a
large number k of such dual drawings are made, the resulting k values
of $\bar{X} - \bar{Y}$ may be grouped into a frequency distribution. It should be
apparent that (1) this distribution of means will center about
$X' - Y' = 0$; (2) the most frequently occurring value of $\bar{X} - \bar{Y}$ will be
zero; (3) the distribution will be symmetrical about $X' - Y' = 0$;
(4) it will probably have smaller variance than the population, and
(5) its variance is small when $n_X$ and $n_Y$ are large. It will later be
proved that this distribution of the difference of means is normal with
variance

$$\sigma^2\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)$$
One important difference between the analysis of paired and unpaired
variates lies in the estimate of the population variance. In the case of
paired variates, we have a single sample and the estimate $\hat{\sigma}^2$ is given by

$$\hat{\sigma}^2 = \frac{\sum (d - \bar{d})^2}{n - 1} = \frac{\sum [(X - \bar{X}) - (Y - \bar{Y})]^2}{n - 1}$$

With unpaired variates the estimate $\hat{\sigma}^2$ will be shown to be

$$\hat{\sigma}^2 = \frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2}$$

Note that the estimate $\hat{\sigma}^2$ from a single sample of differences is based
on n - 1 independent differences whereas if the original variates are
not, or are considered not to be, paired, the estimate is based on
$n_X + n_Y - 2$ independent variates.

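We have, for the unpaired variates

$$d' = 0, \quad \bar{d} = 8, \quad n = 15, \quad \hat{\sigma}^2 = 125.029$$

$$\hat{\sigma}^2\left(\frac{1}{n} + \frac{1}{n}\right) = 16.671$$

and the standard error of the difference of means is

$$\hat{\sigma}_{\bar{d}} = \sqrt{16.671} = 4.08$$

The deviation in standard error units is

$$\frac{8}{4.08} = 1.96$$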


which for 28 degrees of freedom is not significant, for P is greater than 
0.05, whereas in the analysis of the paired variates, the difference was 
significant. In this example the gain in sensitivity from pairing out- 
weighed the loss of half of the degrees of freedom, and the testimony of 
the paired variates may be accepted. This gain in sensitivity resulted 
from the exclusion, by pairing, of the effects of factors which affected 
both samples.

1.18 Equality of the variances. In testing the significance of the
difference of the means of unpaired variates, the statistic t was computed,
where

$$t = \frac{\bar{d}}{\sqrt{\dfrac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2}\left(\dfrac{1}{n_X} + \dfrac{1}{n_Y}\right)}}$$

This might have been written

$$t = \frac{\bar{d}}{\sqrt{\dfrac{n_X s_X^2 + n_Y s_Y^2}{n_X + n_Y - 2}\left(\dfrac{1}{n_X} + \dfrac{1}{n_Y}\right)}}$$

where $s_X^2$ and $s_Y^2$ are the sample variances. The test of the means of
unpaired variates results in the acceptance or rejection of the hypothesis
that the two normal populations have the same mean $X' = Y'$
and the same variance $\sigma_X^2 = \sigma_Y^2$. If $\sigma_X^2 \neq \sigma_Y^2$, any inference regarding
the validity of the hypothesis $X' = Y'$ is open to question, for a large
value of t may reflect differences in variances rather than differences in
means. To test the hypothesis $\sigma_X^2 = \sigma_Y^2$ we compute from the two
samples the value of the statistic $L_1$, which for k samples is given by

$$L_1 = \frac{\left(s_1^2 s_2^2 \cdots s_k^2\right)^{1/k}}{s_a^2}$$

where

$$s_a^2 = \frac{1}{k}\sum_{i=1}^{k} s_i^2$$

If the variances are identical, $s_1^2 = s_2^2 = \cdots = s_k^2$, then $L_1 = 1$.

The distribution of $L_1$ for random samples from a normal population
has been approximated and Table X shows values of $L_1$, for samples
of various sizes, beyond which 5 per cent and 1 per cent of all values of $L_1$
lie. Note that $L_1$ always lies between 1 and 0. We have

$$s_X^2 = 125.09, \quad s_Y^2 = 108.29, \quad s_a^2 = 116.69$$

from which

$$L_1 = 0.997$$
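The value of $L_1$ can be checked by computing the ratio of the geometric mean of the sample variances to their arithmetic mean. The sketch below assumes samples of equal size, so that the unweighted arithmetic mean applies (as in the present example); the function name `l1_statistic` is an assumption made for illustration.

```python
import math

def l1_statistic(variances):
    """Ratio of the geometric mean to the arithmetic mean of k sample variances.

    Assumes all k samples are of equal size (unweighted means).
    """
    k = len(variances)
    geometric_mean = math.exp(sum(math.log(v) for v in variances) / k)
    arithmetic_mean = sum(variances) / k
    return geometric_mean / arithmetic_mean

# Sample variances quoted in the text for the two kinds of pipe.
l1 = l1_statistic([125.09, 108.29])
print(round(l1, 3))
```

Because a geometric mean never exceeds the corresponding arithmetic mean, the ratio cannot exceed 1, and it equals 1 only when all the variances are equal.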


This is far above the 5 per cent level of $L_1$ shown in Table X (the 5
per cent level of $L_1$ is 0.8673); accordingly the variances of the two
normal populations from which these samples were drawn are not
significantly different.

1.19 Other tests of significance of the difference of two means.

If the population is definitely not normal, it is necessary to use a test
not assuming normality. One such test has been given by Wald and
Wolfowitz (45). If the $L_1$ test indicates that the variances differ
significantly, a test of the hypothesis $X' = Y'$ has been proposed
by Fisher and Behrens (Sukhatme, 40, for examples, tables). Often
in experimental work, both normality and equality of variances will
be found or can be assumed, and in such instances Student's t test,
which incorporates normality and equal variances into its hypothesis,
should be used; the t test (or for large samples, the u test) under
these conditions will be more sensitive than a test which is designed to be
valid for more general conditions.

1.20 Further examples. Beckwith (2) gives the following data for 
tuft bind tests on each of two rugs. The values are unpaired; only 
one test of significance is available. 


Rug No. 1    Rug No. 2

  10.0         10.5
  10.5          9.5
   9.5          8.5
  18.5          9.0
  14.0          8.5
  14.0         12.0
  12.0          8.0
   9.5         10.5
  12.5          7.0
  10.0         10.5


Are the population means significantly different? The test already
used may be summarized as follows: If a large number of pairs of small
samples of size $n_X$ and $n_Y$ respectively are drawn at random from a
normal population of mean $d' = X' - Y'$ and variance $\sigma^2$, the quantity
$\bar{d}/\hat{\sigma}_{\bar{d}}$ is distributed as Student's t with $n_X + n_Y - 2$ degrees of
freedom, where

$$\bar{d} = \bar{X} - \bar{Y}$$

$$\hat{\sigma}_{\bar{d}} = \hat{\sigma}\sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}$$

and the best estimate $\hat{\sigma}^2$ of the unknown variance $\sigma^2$ is

$$\hat{\sigma}^2 = \frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2}$$

We have

$$\hat{\sigma}^2 = 5.174, \quad \hat{\sigma} = 2.275$$

For t = 2.605 with 18 degrees of freedom we find from Table V that
P < 0.02. The means are significantly different.
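The figures for the two rugs can be reproduced directly from the tuft bind data. The following is a minimal sketch of the unpaired (pooled variance) computation; the function name `pooled_t` and the variable names are assumptions made for illustration.

```python
import math

rug1 = [10.0, 10.5, 9.5, 18.5, 14.0, 14.0, 12.0, 9.5, 12.5, 10.0]
rug2 = [10.5, 9.5, 8.5, 9.0, 8.5, 12.0, 8.0, 10.5, 7.0, 10.5]

def pooled_t(x, y):
    """Return (t, degrees of freedom, pooled variance) for unpaired samples."""
    n_x, n_y = len(x), len(y)
    mean_x, mean_y = sum(x) / n_x, sum(y) / n_y
    # Sums of squares about each sample's own mean.
    ss = sum((v - mean_x) ** 2 for v in x) + sum((v - mean_y) ** 2 for v in y)
    df = n_x + n_y - 2
    variance = ss / df
    standard_error = math.sqrt(variance * (1 / n_x + 1 / n_y))
    return (mean_x - mean_y) / standard_error, df, variance

t, df, variance = pooled_t(rug1, rug2)
print(round(variance, 3), round(math.sqrt(variance), 3))
print(round(t, 3), df)
```

The probability for this t at 18 degrees of freedom is then taken from a table of Student's distribution, as in the text.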

The following data showing the results of field tests on the corrosion 
of non-bituminous pipe coatings for underground use have been given 
by Logan and Ewing (25). 


Soil type    Lead coated steel pipe    Bare steel pipe

    A               27.3                    41.4
    B               18.4                    18.9
    C               11.9                    21.7
    D               28.7                     9.8
    E               11.3                    16.8
    F               14.8                     9.0
    G               20.8                    19.3
    H               21.6                    11.1
    I               17.9                    32.1
    J                7.8                     7.4
    K               18.6                    68.3
    L               14.7                    20.7
    M               19.0                    34.4
    N               65.3                    76.2


Do these two types of pipe differ significantly in their resistance to 
corrosion? 

Analysis of the unpaired variates yields

t = -0.931 

which for 26 degrees of freedom is not significant. 

In this example there is some evidence that the data in any one row 
are not independent. Type of soil is probably responsible for any 
such lack of independence; for example, soil N appears to be highly 
corrosive to both types of coatings whereas soil J has slight effect regard- 
less of covering; this “positive” correlation is, however, not in 
evidence in all pairs. Analysis of the paired variates gives the results
shown below.

Soil type    Difference in penetration

    A              -14.1
    B              - 0.5
    C              - 9.8
    D              +18.9
    E              - 5.5
    F              + 5.8
    G              + 1.5
    H              +10.5
    I              -14.2
    J              + 0.4
    K              -49.7
    L              - 6.0
    M              -15.4
    N              -10.9

    Mean = - 6.357

and

t = -1.487

From Table V, for 13 degrees of freedom and t = -1.487, we have
P = 0.17, approximately. The value P = 0.17 is well above the critical
level, 0.05. Both tests indicate that one type of pipe is not more liable
to corrosion than the other.
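Both analyses of the pipe coating data can be retraced from the table of penetrations. The sketch below computes the unpaired and the paired values of t; the function and variable names are assumptions made for illustration.

```python
import math

lead_coated = [27.3, 18.4, 11.9, 28.7, 11.3, 14.8, 20.8,
               21.6, 17.9, 7.8, 18.6, 14.7, 19.0, 65.3]
bare_steel = [41.4, 18.9, 21.7, 9.8, 16.8, 9.0, 19.3,
              11.1, 32.1, 7.4, 68.3, 20.7, 34.4, 76.2]

def sum_sq_dev(values):
    """Sum of squared deviations of the values about their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def unpaired_t(x, y):
    df = len(x) + len(y) - 2
    variance = (sum_sq_dev(x) + sum_sq_dev(y)) / df
    se = math.sqrt(variance * (1 / len(x) + 1 / len(y)))
    return (sum(x) / len(x) - sum(y) / len(y)) / se, df

def paired_t(x, y):
    d = [a - b for a, b in zip(x, y)]   # differences within each soil type
    n = len(d)
    variance = sum_sq_dev(d) / (n - 1)
    return (sum(d) / n) / math.sqrt(variance / n), n - 1

t_unpaired, df_unpaired = unpaired_t(lead_coated, bare_steel)
t_paired, df_paired = paired_t(lead_coated, bare_steel)
print(round(t_unpaired, 3), df_unpaired)
print(round(t_paired, 3), df_paired)
```

Pairing here lowers the standard error but also halves the degrees of freedom; in this example neither analysis reaches the 5 per cent level.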


Fieldner and Selvig (13) give the following data on the ash content of
dry coal. Each pair of samples came from a different coal supply.


Sample A   Sample B      Sample A   Sample B      Sample A   Sample B

   8.91       9.02         13.04      13.08         12.82      12.79
  11.47      11.36         12.75      12.23          9.87       9.69
   9.81      10.63         11.52      11.65          8.85       9.22
   9.34       9.44         10.03      10.21         10.49      10.58
   9.73       9.88         10.75      10.06          9.16       9.39
  10.22      10.03          9.77      10.16         11.35      11.72
   8.35      10.26         11.90      12.11         12.29      12.43
  10.19      10.20         13.66      13.08          7.95       7.44
  11.49      11.45         12.94      13.12          9.14       9.77
  13.20      12.95         12.36      12.83          9.32      10.01
  13.73      14.42          5.89       5.75          4.16       4.08
  11.51      11.21          6.22       5.99          8.41       8.72
  10.60      10.60          5.27       5.36          5.70       6.01
  11.11      10.94          5.69       5.91          4.43       4.40
  10.39      10.05          5.47       5.33          4.69       4.52
  10.59      11.20          5.05       4.93          4.51       4.50
   9.88       9.87          5.25       5.37          3.42       3.32
  11.18      11.51         12.66      13.01          3.87       3.77
  10.58      11.27         12.12      12.56          4.25       4.06

It is clear that the pairs of values are positively correlated, coal source 
being the control. The samples each weigh 3 pounds and are prepared 
in identical fashion, so any differences between sample A and sample B 
are expected to be negligibly slight. Do the data support this expec- 
tation? 

We form differences: 


Sample A - Sample B    Sample A - Sample B    Sample A - Sample B

       -0.11                  -0.04                  +0.03
       +0.11                  +0.52                  +0.18
       -0.82                  -0.13                  -0.37
       -0.10                  -0.18                  -0.09
       -0.15                  +0.69                  -0.23
       +0.19                  -0.39                  -0.37
       -1.91                  -0.21                  -0.14
       -0.01                  +0.58                  +0.51
       +0.04                  -0.18                  -0.63
       +0.25                  -0.47                  -0.69
       -0.69                  +0.14                  +0.08
       +0.30                  +0.23                  -0.31
        0.00                  -0.09                  -0.31
       +0.17                  -0.22                  +0.03
       +0.34                  +0.14                  +0.17
       -0.61                  +0.12                  +0.01
       +0.01                  -0.12                  +0.10
       -0.33                  -0.35                  +0.10
       -0.69                  -0.44                  +0.19


We then find 

t = -1.972 

From Table V, for t = -1.972 and 56 degrees of freedom P is slightly
below 0.05. The sample means may be considered significantly different,
although the margin is slight. We conclude that experimental
technique is subject to improvement, or that the samples A and B
differ with respect to a non-randomized factor.

If the exceptional deviate d = —1.91 (the seventh value) is omitted, 
the result is 

t = -1.679 

which, for 55 degrees of freedom gives P > 0.05 and the difference in
mean ash content is judged not to be significant. Omission of an
observation or observations is an unsound practice and should be done
only when the investigator has reason to believe that the observation
in question was subject to special influences not affecting the remaining
observations.

1.21 Large samples, paired variates. The estimate from small
samples of the population variance is subject to error, and the amount
of the error depends on the size of the samples. Hence in determining
the probability P that a given value of $t = \bar{d}/\hat{\sigma}_{\bar{d}}$ could be exceeded in
random sampling from a normal population we must take into account
the number of independent observations (degrees of freedom) on which
the estimate of the population variance is based. Accordingly, the
probabilities of exceeding t given in Table V depend on the number of
degrees of freedom.

If the experiment is relatively large, with, say, more than 30 observations
in each sample, the population variance can be assumed to be
given exactly by the sample variance, and the distribution of t passes
into the normal distribution of u (areas under which are given in Table
IV). The probability P that a given value of $u = \bar{d}/\sigma_{\bar{d}}$ could be
exceeded does not involve the concept of degrees of freedom; this is
evidenced by the absence of degrees of freedom in Table IV. The
normal approximation to t is completely valid only if the sample size n
is infinitely large; only the probabilities associated with the infinite
sample sizes shown in the last row of Table V will correspond to those
of Table IV. For example, in Table V for $n = \infty$ and t = 1.96, we find
P = 0.05, as indicated in the illustration at the left.

From Table IV for u = 1.96 we find a value of 0.475; the area of the
two tails is 0.05 as before; this is shown in the illustration at the right.
If n > 30, the normal values given in Table IV may be safely used.

Let us reanalyze the data on the ash content of coal, now considering
the sample of $d_i$ to be large. We have

$$d' = 0, \quad \bar{d} = -0.1079, \quad n = 57$$

$$\sigma^2 = 0.168 \quad \text{and} \quad \sigma = 0.41$$

$$\sigma_{\bar{d}} = \frac{0.41}{\sqrt{57}} = 0.054$$

$$u = \frac{\bar{d}}{\sigma_{\bar{d}}} = \frac{-0.1079}{0.054} = -1.99$$

From Table IV, P = 0.0466. The difference is judged barely significant,
as before.
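The two-tailed areas read from a table of the normal distribution can also be computed from the complementary error function in Python's standard library. This is an illustrative sketch; the function name `two_tailed_normal_p` is an assumption, and the summary figures are those quoted in the text.

```python
import math

def two_tailed_normal_p(u):
    """Probability that a unit normal deviate exceeds |u| in either direction."""
    return math.erfc(abs(u) / math.sqrt(2))

# Summary figures for the coal data, as quoted in the text.
d_bar, sigma, n = -0.1079, 0.41, 57
standard_error = sigma / math.sqrt(n)
u = d_bar / standard_error

print(round(two_tailed_normal_p(1.96), 4))   # area of the two tails beyond 1.96
print(round(two_tailed_normal_p(1.99), 4))   # area of the two tails beyond 1.99
print(round(standard_error, 3), round(u, 2))
```

Carrying the unrounded standard error gives u = -1.99 to two decimals, in agreement with the text.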

1.22 Large samples, unpaired variates. If the observations in two
small samples are, or are considered to be, unpaired, the estimate $\hat{\sigma}^2$ of
the variance of the normal population is

$$\hat{\sigma}^2 = \frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2} = \frac{n_X s_X^2 + n_Y s_Y^2}{n_X + n_Y - 2}$$

and the distribution of the statistic

$$t = \frac{\bar{d}}{\hat{\sigma}_{\bar{d}}} = \frac{\bar{X} - \bar{Y}}{\hat{\sigma}\sqrt{\dfrac{1}{n_X} + \dfrac{1}{n_Y}}}$$

is known but is not normal. If, however, the two samples are large,
say $n_X > 30$, $n_Y > 30$, the population variance may be assumed to be
known and to be given by the weighted mean of sample variances, i.e.,

[4] $$\sigma^2 = \frac{n_X s_X^2 + n_Y s_Y^2}{n_X + n_Y}$$

and the statistic t, which we have called u under these conditions, may
be written

$$u = \frac{\bar{d}}{\sigma_{\bar{d}}} = \frac{\bar{d}}{\sigma\sqrt{\dfrac{1}{n_X} + \dfrac{1}{n_Y}}}$$

where $\sigma$ is the square root of [4]; u is distributed normally.

1.23 Examples of 1.22. The British Cotton Industry Research
Association (5) records the following results of breaking load tests on
two types of yarn:

Type of yarn    Mean breaking load    Standard     Number in
                    in ounces         deviation     sample

     X                6.83              1.23          1782
     Y                7.48              1.33          1914

Do the yarns differ significantly in their mean values? We have

$$\sigma_{\bar{d}}^2 = \frac{n_X s_X^2 + n_Y s_Y^2}{n_X + n_Y}\left(\frac{1}{n_X} + \frac{1}{n_Y}\right) = \frac{s_X^2}{n_Y} + \frac{s_Y^2}{n_X} = \frac{(1.23)^2}{1914} + \frac{(1.33)^2}{1782} = 0.001783$$

$$\frac{\bar{d}}{\sigma_{\bar{d}}} = \frac{0.65}{0.04223} = 15.39$$

This deviation is so improbable that it cannot be located in Table IV.
Hence the yarns differ significantly in their means. The difference is,
however, only 0.65 ounce and may be of slight practical significance.
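The yarn computation can be retraced from the four summary figures for each type. The sketch below uses the weighted mean of the sample variances, equation [4], and also checks the algebraically equivalent cross form used in the text; the variable names are assumptions.

```python
import math

n_x, mean_x, sd_x = 1782, 6.83, 1.23   # yarn X
n_y, mean_y, sd_y = 1914, 7.48, 1.33   # yarn Y

# Equation [4]: weighted mean of the two sample variances.
variance = (n_x * sd_x ** 2 + n_y * sd_y ** 2) / (n_x + n_y)
variance_of_difference = variance * (1 / n_x + 1 / n_y)

# Equivalent form: each sample variance divided by the other sample's size.
cross_form = sd_x ** 2 / n_y + sd_y ** 2 / n_x

u = abs(mean_x - mean_y) / math.sqrt(variance_of_difference)
print(round(variance_of_difference, 6), round(cross_form, 6))
print(round(u, 2))
```

With samples this large even a small difference of means produces a very large u, which is why the text separates statistical from practical significance.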

Van Rest (43) gives the following data and calculations on the effect 
of stain (outdoor storage) on the hardness and bending strength of 
wood. 


                                    Hardness              Bending strength
                               Stained   Unstained      Stained      Unstained

Number of tests                   40        100             40           100
Mean                             117        132          6,184         6,270
Sum of squares about mean      8,655     27,244     16,799,390    30,459,499

Are hardness and bending strength significantly affected by stain?
Previous formulae yield

$$\sigma^2 = \frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y}$$

we obtain:

Hardness: $\sigma_{\bar{d}} = 2.996$ and $\bar{d}/\sigma_{\bar{d}} = 5.007$

Bending strength: $\sigma_{\bar{d}} = 108.7$ and $\bar{d}/\sigma_{\bar{d}} = 0.791$


Hardness is really affected by stain, whereas bending strength is not, 
for the probabilities from Table IV are respectively P = 0.0000003 
(highly significant) and P = 0.2148 (not significant). 
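The two deviations can be reproduced from the numbers of tests, the means, and the sums of squares. This is a sketch under the large-sample formula with divisor $n_X + n_Y$; the function name `large_sample_u` is an assumption made for illustration.

```python
import math

def large_sample_u(n_x, mean_x, ss_x, n_y, mean_y, ss_y):
    """Return (standard error of the difference, deviation in standard error units)."""
    variance = (ss_x + ss_y) / (n_x + n_y)          # large-sample variance estimate
    se = math.sqrt(variance * (1 / n_x + 1 / n_y))  # standard error of the difference
    return se, abs(mean_x - mean_y) / se

# Stained versus unstained wood, figures from the table.
se_hardness, u_hardness = large_sample_u(40, 117, 8655, 100, 132, 27244)
se_bending, u_bending = large_sample_u(40, 6184, 16799390, 100, 6270, 30459499)

print(round(se_hardness, 3), round(u_hardness, 3))
print(round(se_bending, 1), round(u_bending, 3))
```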

1.24 Examples in which the hypothetical mean is not zero. Frequently
in industrial practice we may want to use as the population
mean the mean of a large number of observations of an earlier date or
a figure set by a standards-making body. The practical problem is to
determine whether or not the mean of the current sample differs
significantly from such a population mean.

The first illustration deals with small samples. Beckwith (2) gives
the following data on the pile wool content, in ounces per three-quarters
of a yard, of a fabric.


26.0    27.2    26.5    26.8    27.0

Quality specifications require a mean of 27.4. Are the data of this
small sample compatible with the hypothesis that the mean of the
population from which the sample was drawn is 27.4?

The appropriate method of analysis, which has already been used,
is summarized as follows: If a quality characteristic X is normally
distributed with mean $X'$ and unknown variance $\sigma^2$, the quantity

$$t = \frac{\bar{X} - X'}{\hat{\sigma}_{\bar{X}}}$$

is distributed as Student's t with n - 1 degrees of freedom, where

$$\hat{\sigma}_{\bar{X}}^2 = \frac{\hat{\sigma}^2}{n}$$

and the best estimate $\hat{\sigma}^2$ of the unknown variance $\sigma^2$ from a single
sample is

$$\hat{\sigma}^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

A test of normality of the population would have to be based on five
observations; it will not be attempted. We have

$$t = \frac{\bar{X} - X'}{\hat{\sigma}_{\bar{X}}} = -3.34$$

For n - 1, i.e., for four degrees of freedom and for t = -3.34 we find
P = 0.015 (15 samples in 1000). As this is a very low probability, the
material at hand must be considered significantly different from the
specification in its mean.
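The value of t for the pile wool data follows from the five observations and the specified mean. The following is a minimal sketch; the variable names are assumptions.

```python
import math

pile_wool = [26.0, 27.2, 26.5, 26.8, 27.0]   # ounces per three-quarters of a yard
specified_mean = 27.4                        # hypothetical population mean X'

n = len(pile_wool)
sample_mean = sum(pile_wool) / n
# Estimate of the population variance with divisor n - 1.
sigma2_hat = sum((x - sample_mean) ** 2 for x in pile_wool) / (n - 1)
standard_error = math.sqrt(sigma2_hat / n)

t = (sample_mean - specified_mean) / standard_error
print(round(sample_mean, 2), round(sigma2_hat, 2), round(t, 2), n - 1)
```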

The following example illustrates the case for large samples: Pettebone
and Young (32) record the following 1306 readings on the heat value in
Btu of a mixed gas. The data cover a period from January 1932 to
January 1937.


     Btu           Midpoints    Number of readings

548.5 - 550.5        549.5              6
546.5 - 548.5        547.5              3
544.5 - 546.5        545.5              6
542.5 - 544.5        543.5             30
540.5 - 542.5        541.5             57
538.5 - 540.5        539.5            118
536.5 - 538.5        537.5            202
534.5 - 536.5        535.5            260
532.5 - 534.5        533.5            284
530.5 - 532.5        531.5            197
528.5 - 530.5        529.5            103
526.5 - 528.5        527.5             36
524.5 - 526.5        525.5              3
522.5 - 524.5        523.5              0
520.5 - 522.5        521.5              1

                                     1306


On 64 days at irregular intervals in the 5-year period, state inspection 
was conducted. The 64 observations which constituted an apparently 
random sample from the population of 1306 observations are given in the 
following table. 


Btu (midpoints)    Number of days

    549.5                1
    547.5                1
    545.5                3
    543.5                3
    541.5                5
    539.5               10
    537.5               11
    535.5                9
    533.5                8
    531.5                6
    529.5                5
    527.5                0
    525.5                1
    523.5                0
    521.5                1

                        64


So far as means are concerned, is it likely that this constitutes a 
random sample from the given population? 

The appropriate procedure is summarized as follows: If a quality
characteristic X is distributed normally with mean $X'$ and variance $\sigma^2$,
the means of random samples each of n observations will be distributed
normally with mean $X'$ and variance $\sigma^2/n$.

In the present example a test of the normality of the population is
hardly necessary, for it will presently be noted that if a sample is larger
than 50 and if the population is as much as 10 times as large as the
sample, the tendency to normality of the distribution of the means of
random samples is negligibly affected by the nature of the population.
If a test is to be applied, the statistics a and $\sqrt{b_1}$ or $b_2$ and $\sqrt{b_1}$ are
computed. When testing the normality of the parent population
from a small sample, the statistic a is better than $b_2$. In the present
example 1306 observations are available, and we shall use the more
familiar $b_2$ test. For a normal population

$$\beta_2 = \frac{\sum (X - X')^4 / N}{\left[\sum (X - X')^2 / N\right]^2} = 3$$

The distribution of

$$b_2 = \frac{\sum (X - \bar{X})^4 / n}{\left[\sum (X - \bar{X})^2 / n\right]^2}$$

is known and the 5 per cent and 1 per cent values are given in Table II.
For our data

$$\sqrt{b_1} = 0.43, \quad b_2 = 3.58$$

which indicate that the population is not normal, for both tests yield
probabilities of less than 0.01. Normality of the means can, however,
be assumed. We have

$$X' = 534.99, \quad \sigma = 3.85, \quad \bar{X} = 536.72, \quad n = 64$$

Also

$$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} = \frac{(3.85)^2}{64} = 0.2316, \quad \sigma_{\bar{X}} = 0.481$$

To enter Table IV, form

$$u = \frac{\bar{X} - X'}{\sigma_{\bar{X}}} = \frac{1.73}{0.481} = +3.60$$

From Table IV we find that only 4 times in 10,000 trials would 3.60 
be exceeded if chance alone is responsible for the deviation. This is a 
very small probability; therefore the mean of the inspector’s readings 
departs significantly from the population mean, and reasons for this fact 
should be sought. 
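The population mean and standard deviation, the mean of the inspector's sample, and the deviation u can all be retraced from the two frequency tables. The sketch below is illustrative only. The last entry of `sample_days` (one day at 521.5 Btu) is an assumption: it is the value that brings the listed counts to the stated total of 64 and reproduces the stated sample mean of 536.72. The function and variable names are likewise assumptions.

```python
import math

# Class midpoints from 549.5 down to 521.5 in steps of 2 Btu.
midpoints = [549.5 - 2 * i for i in range(15)]
population_readings = [6, 3, 6, 30, 57, 118, 202, 260, 284, 197, 103, 36, 3, 0, 1]
# ASSUMPTION: the final count (521.5 Btu) is taken as 1 so that the total is 64.
sample_days = [1, 1, 3, 3, 5, 10, 11, 9, 8, 6, 5, 0, 1, 0, 1]

def grouped_mean_sd(mids, freqs):
    """Mean and standard deviation (divisor n) of grouped data."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, mids)) / n
    variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / n
    return mean, math.sqrt(variance)

population_mean, population_sd = grouped_mean_sd(midpoints, population_readings)
sample_mean, _ = grouped_mean_sd(midpoints, sample_days)
n = sum(sample_days)

standard_error = population_sd / math.sqrt(n)
u = (sample_mean - population_mean) / standard_error
print(round(population_mean, 2), round(population_sd, 2))
print(round(sample_mean, 2), round(standard_error, 3), round(u, 2))
```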

The problem and its solution are shown in the following illustration.
Each shaded area P is 0.0002.

[Illustration: a smoothed distribution of the 1306 observations of heat
content in Btu, together with the distribution of means of random samples
each of size n = 64; the sample mean lies $3.60\,\sigma_{\bar{X}}$ from the population mean.]

NOTES 

1.25 The $\sqrt{b_1}$ and $b_2$ tests for normality, in detail. The following example
illustrates in detail the $\sqrt{b_1}$ and $b_2$ tests for normality. In connection with the
example we shall show certain short-cut methods of calculating the mean and
the variance.

Pulsifer (33) gives the following data on the tensile strength, in actual load
pounds, of 1000 cap screws of a certain dimension.

Tensile strength    Number of    Tensile strength    Number of
   in pounds         screws         in pounds         screws

    15,500              1            17,800             41
    15,600             ..            17,900             29
    15,700              6            18,000             42
    15,800              8            18,100             49
    15,900              4            18,200             48
    16,000             11            18,300             41
    16,100              6            18,400             33
    16,200             15            18,500             48
    16,300             11            18,600             52
    16,400             18            18,700             28
    16,500              5            18,800             48
    16,600             10            18,900             27
    16,700             19            19,000             35
    16,800             23            19,100             25
    16,900             19            19,200             15
    17,000             20            19,300             15
    17,100             23            19,400              8
    17,200             36            19,500              3
    17,300             33            19,600              3
    17,400             35            19,700              1
    17,500             31            19,800              2
    17,600             33            19,900             ..
    17,700             39            20,000              1

                                     Total            1000


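Is the population of tensile strengths normally distributed?
The normal population distribution is given by

[5] $$y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(X - X')^2 / 2\sigma^2}$$

where y is the fractional frequency (frequency divided by total frequency N)
or probability of screws with tensile strength X, $X'$ is the mean tensile
strength, and $\sigma$ is the standard deviation. For [5] the values of $\sqrt{\beta_1}$ and
$\beta_2$ for $N \to \infty$ are respectively 0 and 3, where

$$\sqrt{\beta_1} = \frac{\nu_3}{\nu_2^{3/2}}, \quad \beta_2 = \frac{\nu_4}{\nu_2^2}$$

$\nu_k$ being defined as the kth moment of the deviation $x = X - X'$, i.e.,

$$\nu_k = \int_{-\infty}^{\infty} x^k \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2 / 2\sigma^2}\, dx$$

Estimates of $X'$, $\sigma$, and $\beta$ from the data of large samples are

$$\bar{X}\ (\to X') = \frac{\sum X}{n}$$

$$s^2\ (\to \sigma^2) = \frac{\sum (X - \bar{X})^2}{n}\ (= \mu_2)$$

$$\sqrt{b_1}\ (\to \sqrt{\beta_1}) = \frac{\mu_3}{\mu_2^{3/2}}$$

$$b_2\ (\to \beta_2) = \frac{\mu_4}{\mu_2^2}$$

where

$$\mu_k\ (\to \nu_k) = \frac{\sum (X - \bar{X})^k}{n}$$

and n is the number of observations in the large sample.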

We regroup the data to facilitate computation, although a minor grouping 
error is thereby introduced and certain information is lost. A correction will 
later be introduced which will partially remove this error. 


Tensile strength      Number of
class midpoints        screws

    15,500                1
    15,800               18
    16,100               32
    16,400               34
    16,700               52
    17,000               62
    17,300              104
    17,600              103
    17,900              112
    18,200              138
    18,500              133
    18,800              103
    19,100               75
    19,400               26
    19,700                6
    20,000                1

                       1000


To compute the moments $\mu_k$ it is convenient to substitute for X a new
variable Z:

$$X = a + cZ$$

where a is an arbitrary constant and c is the interval between class midpoints;
in our case c = 300. Summing this expression over all observations and
dividing by n we obtain

$$\bar{X} = a + cD$$

where $D = \sum fZ / n$, f being the frequency of X (or Z). By substituting these
relations into the equation for $\mu_k$ we obtain by simple algebraic expansion the
following equations, which are more convenient for purposes of calculation:

$$n\mu_2 = c^2\left(\sum fZ^2 - nD^2\right)$$

$$n\mu_3 = c^3\left(\sum fZ^3 - 3D\sum fZ^2 + 2nD^3\right)$$

$$n\mu_4 = c^4\left(\sum fZ^4 - 4D\sum fZ^3 + 6D^2\sum fZ^2 - 3nD^4\right)$$

Notice that the class interval c will disappear in calculating $\sqrt{b_1}$ and $b_2$.
The computations are carried out in the following table. The last column,
suggested by Charlier, is for check purposes, for

$$\sum f(Z + 1)^4 = \sum fZ^4 + 4\sum fZ^3 + 6\sum fZ^2 + 4\sum fZ + n$$

X = tensile strength class midpoints; f = number of screws;
Z = deviations in class interval units from a = 17,900.

   X        f      Z       fZ      fZ^2      fZ^3       fZ^4    f(Z+1)^4

15,500      1     -8       -8       64      -512      4,096      2,401
15,800     18     -7     -126      882    -6,174     43,218     23,328
16,100     32     -6     -192    1,152    -6,912     41,472     20,000
16,400     34     -5     -170      850    -4,250     21,250      8,704
16,700     52     -4     -208      832    -3,328     13,312      4,212
17,000     62     -3     -186      558    -1,674      5,022        992
17,300    104     -2     -208      416      -832      1,664        104
17,600    103     -1     -103      103      -103        103          0
17,900    112      0        0        0         0          0        112
18,200    138     +1      138      138       138        138      2,208
18,500    133     +2      266      532     1,064      2,128     10,773
18,800    103     +3      309      927     2,781      8,343     26,368
19,100     75     +4      300    1,200     4,800     19,200     46,875
19,400     26     +5      130      650     3,250     16,250     33,696
19,700      6     +6       36      216     1,296      7,776     14,406
20,000      1     +7        7       49       343      2,401      4,096

         1000          -1,201    8,569   -23,785    186,373    198,275
                       +1,186            +13,672
                          -15            -10,113

The Charlier check indicates that the calculations have probably been made
correctly, for

198,275 = 186,373 + 4(-10,113) + 6(8569) + 4(-15) + 1000
        = 198,275

The moments are

$$\mu_2 = \frac{8569 - 1000\left(\frac{-15}{1000}\right)^2}{1000}\,(300)^2 = 8.568775\,(300)^2$$

$$\mu_3 = \frac{-10{,}113 - 3\left(\frac{-15}{1000}\right)(8569) + 2(1000)\left(\frac{-15}{1000}\right)^3}{1000}\,(300)^3 = -9.727402\,(300)^3$$

$$\mu_4 = \frac{186{,}373 - 4\left(\frac{-15}{1000}\right)(-10{,}113) + 6\left(\frac{-15}{1000}\right)^2(8569) - 3(1000)\left(\frac{-15}{1000}\right)^4}{1000}\,(300)^4 = 185.777788\,(300)^4$$

from which

$$\sqrt{b_1} = 0.39, \quad b_2 = 2.53$$

Are these values sufficiently close to 0 and 3? From Table I we find that
fewer than one in one hundred samples of size n = 1000 would have $\sqrt{b_1}$
further than 0.39 from 0. For $b_2 = 2.53$, this probability is again < 0.01, as
shown in Table II. Hence the present sample cannot be assumed to have
been randomly drawn from a normal population.
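The column totals, the Charlier check, and the resulting values of $\sqrt{b_1}$ and $b_2$ can be reproduced from the regrouped frequencies. This is a sketch of the short method only; the moments are kept in class interval units (that is, with c taken as 1), since c cancels in both statistics, and the variable names are assumptions.

```python
# Regrouped class midpoints 15,500 to 20,000 and their frequencies.
midpoints = [15500 + 300 * i for i in range(16)]
freq = [1, 18, 32, 34, 52, 62, 104, 103, 112, 138, 133, 103, 75, 26, 6, 1]

a, c = 17900, 300                        # arbitrary origin and class interval
z = [(x - a) // c for x in midpoints]    # deviations in class interval units
n = sum(freq)

# Column totals of fZ, fZ^2, fZ^3, fZ^4 and the Charlier check column.
s1, s2, s3, s4 = (sum(f * zi ** k for f, zi in zip(freq, z)) for k in (1, 2, 3, 4))
charlier = sum(f * (zi + 1) ** 4 for f, zi in zip(freq, z))

D = s1 / n
# Moments about the mean, in units of c, c^2, c^3, c^4 respectively.
m2 = (s2 - n * D ** 2) / n
m3 = (s3 - 3 * D * s2 + 2 * n * D ** 3) / n
m4 = (s4 - 4 * D * s3 + 6 * D ** 2 * s2 - 3 * n * D ** 4) / n

sqrt_b1 = abs(m3) / m2 ** 1.5   # magnitude, as quoted in the text
b2 = m4 / m2 ** 2

print(s1, s2, s3, s4, charlier)
print(round(m2, 6), round(m3, 6), round(m4, 6))
print(round(sqrt_b1, 2), round(b2, 2))
```

The third moment is negative here, so the skewness is to the left; the text quotes only its magnitude.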

1.26 Correction of the moments. In estimating $\sqrt{\beta_1}$ and $\beta_2$ the moments
$\mu_k$ may be corrected for errors resulting from grouping the original observations
into classes. The error arises from the fact that we wish to estimate the values
of $\sqrt{\beta_1}$ and $\beta_2$ of a continuous curve whereas our data form a discontinuous
curve. The adjustments most generally applicable are those due to Sheppard
(48). These adjustments assume that the corrected distribution has high
order contact (very gradual tapering) with the X axis at its extremities.

$$\mu_2\ (\text{corrected}) = \mu_2 - \tfrac{1}{12}c^2$$

$$\mu_3\ (\text{corrected}) = \mu_3$$

$$\mu_4\ (\text{corrected}) = \mu_4 - \tfrac{1}{2}\mu_2 c^2 + \tfrac{7}{240}c^4$$

For our data

$$\mu_2\ (\text{corrected}) = 8.485442\,(300)^2$$

$$\mu_3\ (\text{corrected}) = -9.727402\,(300)^3$$

$$\mu_4\ (\text{corrected}) = 181.522567\,(300)^4$$

and

$$\sqrt{b_1} = 0.39, \quad b_2 = 2.52$$

which yield the same conclusions.
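Sheppard's adjustments are easily applied to the uncorrected moments. In the sketch below the moments are expressed in class interval units, so that c = 1 in the correction formulas; the starting values are those quoted in the text and the variable names are assumptions.

```python
# Uncorrected moments in units of c^2, c^3, c^4 (c = 300), as quoted in the text.
m2, m3, m4 = 8.568775, -9.727402, 185.777788

# Sheppard's corrections with c = 1, since the moments are in class interval units.
m2_corrected = m2 - 1 / 12
m3_corrected = m3
m4_corrected = m4 - m2 / 2 + 7 / 240

sqrt_b1 = abs(m3_corrected) / m2_corrected ** 1.5
b2 = m4_corrected / m2_corrected ** 2

print(round(m2_corrected, 6), round(m4_corrected, 6))
print(round(sqrt_b1, 2), round(b2, 2))
```

With sixteen classes the corrections are small, which is why the conclusions are unchanged.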

The given data and the closest approximating normal distribution are shown
in the following figure (actual fitting of a normal distribution to industrial
data as a supplement to the $\sqrt{b_1}$ and $b_2$ tests used here is seldom fruitful and
the technique is not discussed here).



1.27 Some properties of the normal distribution. The X-variate of the
normal distribution extends from $-\infty$ to $+\infty$; tensile strength, however,
could not fall below zero. No serious error is introduced by this discrepancy.

In addition to properties already given, certain others may be noted. Writing
$x = X - X'$ we find, for a population of size N

$$\int_{-\infty}^{\infty} \frac{N}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx = N$$

that is, the area is N, the total frequency of observations. If we use the
fractional frequency, i.e., probability y instead of f = yN, we find

$$\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx = 1$$

The probability of x falling between $\pm\infty$ is unity; the probability of x falling
between $x_1$ and $x_2$ is given by

$$\int_{x_1}^{x_2} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx$$

Areas under this normal distribution (with abscissa $x/\sigma$) are given in Table IV.
In that table $x_1$ is taken at zero.

The points of inflection for the normal curve occur at $x = \pm\sigma$. We have

$$\frac{d^2 y}{dx^2} = \frac{1}{\sigma^3\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\left(\frac{x^2}{\sigma^2} - 1\right) = 0$$

from which

$$x = \pm\sigma$$

Other properties of the normal distribution are discussed later.

1.28 Moments. While the moments $\mu_k$ about the mean are used in this
chapter in connection with normality, their definition is general. For a
discontinuous distribution of n observations, with which we are always faced
in practice, the kth moment about the mean has been defined by

$$\mu_k = \frac{\sum (X - \bar{X})^k}{n}$$

For a continuous population the corresponding definition of the kth moment
$\nu_k$ about the mean is

$$\nu_k = \int y\, x^k\, dx$$

1.29 That $\sqrt{\beta_1} = 0$ and $\beta_2 = 3$ for a normal distribution. A simple proof
has been given by Bowley (3). The odd moments for the normal distribution
are all zero; hence $\sqrt{\beta_1} = 0$. To prove, let $\nu_{2t+1}$ be any odd moment.
Then

$$\nu_{2t+1} = \int_{-\infty}^{\infty} x^{2t+1} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx = \int_{-\infty}^{\infty} \varphi(x)\, dx = \int_{0}^{\infty} \varphi(x)\, dx + \int_{-\infty}^{0} \varphi(x)\, dx$$

In the last term of the above substitute $-x' = x$. The limits become $\infty$ and 0.
We obtain

$$\nu_{2t+1} = \int_{0}^{\infty} \varphi(x)\, dx - \int_{\infty}^{0} \varphi(-x')\, dx' = \int_{0}^{\infty} \varphi(x)\, dx + \int_{0}^{\infty} \varphi(-x')\, dx' = 0$$

for $\varphi(x) = -\varphi(-x')$ in the function

$$\varphi(x)\, dx = x^{2t+1} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx$$

This result would be obtained for any symmetrical distribution. Finally

$$\sqrt{\beta_1} = \frac{\nu_3}{\nu_2^{3/2}} = 0$$

for non-zero $\nu_2$.

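To determine the value of $\beta_2$ for a normal distribution, first consider $\nu_4$.

$$\nu_4 = \int_{-\infty}^{\infty} x^4\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx$$

The solution is of the form

[6]  $x^3 e^{-x^2/2\sigma^2}$

for, after inclusion of the appropriate constants, the derivative of [6] yields the
fourth and second moments.

Omitting constants, we have for the derivative

$$-\frac{1}{\sigma^2}\, x^4 e^{-x^2/2\sigma^2} + 3x^2 e^{-x^2/2\sigma^2}$$

Including the constants, we find

$$\left[\frac{\sigma^2}{\sigma\sqrt{2\pi}}\, x^3 e^{-x^2/2\sigma^2}\right]_{-\infty}^{\infty} = -\int_{-\infty}^{\infty} x^4\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx + 3\sigma^2 \int_{-\infty}^{\infty} x^2\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx$$

or, since $\sigma^2 = \nu_2$,

$$0 = -\nu_4 + 3\nu_2^2$$

or

$$\beta_2 = \frac{\nu_4}{\nu_2^2} = 3$$

1.30 Short method of computing moments. Consider $\mu_2$. We have

$$\mu_2 = \frac{\sum (X - \bar{X})^2}{n}$$

Make the substitutions $X = a + cZ$ and $\bar{X} = a + cD$ where

$$D = \frac{\sum Z}{n} \qquad \left(= \frac{\sum fZ}{\sum f} \text{ when the data are grouped}\right)$$

We obtain

$$\mu_2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{\sum (a + cZ - a - cD)^2}{n} = \frac{c^2 \sum (Z - D)^2}{n} = \frac{c^2 \left(\sum Z^2 - 2\sum ZD + \sum D^2\right)}{n} = c^2\left(\frac{\sum Z^2}{n} - D^2\right)$$

a form which facilitates rapid calculation. Similar forms have been given for
$\mu_3$ and $\mu_4$.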
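The short method can be illustrated with a small sketch. The observations, the arbitrary origin `a`, and the unit `c` below are hypothetical values chosen for illustration; the two function names are likewise assumptions, not part of the text.

```python
def second_moment_direct(xs):
    """mu_2 computed directly as sum((X - mean)^2) / n."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

def second_moment_coded(xs, a, c):
    """mu_2 computed from coded values Z = (X - a) / c as c^2 * (sum(Z^2)/n - D^2)."""
    zs = [(x - a) / c for x in xs]
    n = len(zs)
    d = sum(zs) / n  # D, the mean of the coded values
    return c * c * (sum(z * z for z in zs) / n - d * d)

# Hypothetical observations, coded about a = 95 in units of c = 0.5.
observations = [93.5, 95.0, 97.0, 94.0, 98.0]
print(second_moment_direct(observations))
print(second_moment_coded(observations, a=95.0, c=0.5))
```

The coded values are small whole numbers, which is what made this form convenient for hand calculation; both functions should return the same second moment apart from rounding.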

1.31 Distributions of $\sqrt{b_1}$ and $b_2$. In order to determine whether or not
the computed values of $\sqrt{b_1}$ and $b_2$ differ significantly from the normal
population values $\sqrt{\beta_1} = 0$ and $\beta_2 = 3$, we need to be able to answer this question:
If a large number of random samples each of size $n$ are drawn from a population
known to be normal and if the statistics $\sqrt{b_1}$ and $b_2$ are computed for each
sample, what will be the distribution curve of $\sqrt{b_1}$ and that of $b_2$? The answer
is not yet exactly known but the moments of the two distributions have been 
given by R. A. Fisher (15, b ) ; on the basis of Fisher’s results, E. Pearson and 
finally Geary and Pearson (20) have constructed approximate tables for various 
values of n; they have done similar work on 0. 

For very large $n$, the distributions of $\sqrt{b_1}$ and $b_2$ approach normality with
standard deviations (usually called standard errors) respectively of $\sqrt{6/n}$
and $\sqrt{24/n}$, approximately. For $n < 1000$, the values given in Table I are
to be preferred to normal approximations.

Assume that for $n = 300$ we have $\sqrt{b_1} = -0.230$. If this value is judged
by reference to the areas given in Table I, we conclude that there are five
chances in 100 that $\sqrt{b_1} = -0.230$ could have been exceeded (in a negative
direction) in random sampling from a normal population. How does this
compare with the normal approximation? The standard error of $\sqrt{b_1}$ is
$\sqrt{6/300} = 0.1414$. Our deviation from the origin $\sqrt{b_1} - \sqrt{\beta_1} = \sqrt{b_1} - 0$ is 0.230
or 1.626 standard error units; from the normal table the area to the left of
$-1.626\,\sigma_{\sqrt{b_1}}$ is 0.0520. Thus at this sample size and at this probability level the
normal approximation to the distribution of $\sqrt{b_1}$ results in a slight overestimate
of the proportion of samples with $\sqrt{b_1}$ to the left of $-0.230$.

[Figure: $\sqrt{b_1}$ distribution; $b_2$ distribution]



Similar considerations hold for $b_2$, which is a test for flatness. For $n = 300$,
$b_2 = 2.59$ (less peaked than the normal curve) is significantly different from
3.00 at the 5 per cent level (i.e., there is 1 chance in 20 that a sample of 300
observations drawn at random from a normal population ($\beta_2 = 3$) would
have a value of $b_2$ of 2.59 or less). Judged by the normal approximation
$b_2 = 2.59$ is not quite significantly different from 3, for the area to the left of
2.59 will be found to be greater than 0.05.

In both instances the normal approximations lead to an overestimate of
the proportion of large differences; in the case of $b_2$ the error is likely to
be more serious, for while the Geary-Pearson approximation to the distribution
of $\sqrt{b_1}$ is quite similar to a normal curve, the $b_2$ approximation differs
appreciably.

In a normal population $\sqrt{b_1}$ and $b_2$ are independent of each other; hence
they constitute independent tests of normality and for a large sample to be
considered normal, both should be satisfied.
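The normal approximations used in this section can be sketched as follows. This is only an illustration of the large-sample standard errors $\sqrt{6/n}$ and $\sqrt{24/n}$ quoted above; it does not reproduce the Geary-Pearson values of Table I, and all function names are assumptions introduced here.

```python
import math

def lower_tail(z):
    """Area under the unit normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def skewness_and_flatness(xs):
    """Return (sqrt(b1), b2) computed from the sample moments about the mean."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

def tail_for_sqrt_b1(value, n):
    """Normal-approximation area to the left of an observed sqrt(b1)."""
    return lower_tail(value / math.sqrt(6.0 / n))

def tail_for_b2(value, n):
    """Normal-approximation area to the left of an observed b2."""
    return lower_tail((value - 3.0) / math.sqrt(24.0 / n))

# The two cases discussed in the text, n = 300.
print(tail_for_sqrt_b1(-0.230, 300))  # close to the 0.0520 quoted above
print(tail_for_b2(2.59, 300))         # greater than 0.05
```

For samples well below $n = 1000$ the text prefers the tabulated values, so these functions are best read as a check on the arithmetic rather than as a replacement for the table.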

1.32 Outline of a derivation of the normal distribution. The importance 
of the normal distribution in sampling theory is evident. This distribution may 
originate in the following way: if a large number of independent causes, each 
producing a slight effect, affect a quality characteristic, values of the latter 
will, under certain conditions, be normally distributed. A derivation from 
Whittaker and Robinson (47) will be outlined. 

The strength of cap screws varies from one screw to another. In other 
words, each shows a deviation from the average. This deviation will be 
assumed to be the effect of a large number of small deviations, the latter caused 
by the operation of a large number of independent causes, each of which has 
but a small effect. 

Let the small deviations be

$$d_1, d_2, \dots, d_n$$

with the total effect or total deviation being

$$d_1 + \dots + d_n$$

or more generally

[7]  $W_1 d_1 + \dots + W_n d_n$

where $W_i$ are weights. What is the probability that for a given observation
(strength of cap screw) this deviation will lie between $a_1$ and $a_2$? The
probability that $d_r$ lies between $x$ and $x + dx$ is

$$\varphi_r(x)\, dx$$

The probability that $d_1$ lies between $d_1$ and $d_1 + dd_1$ is $\varphi_1(d_1)\, dd_1$; and the
probability that $d_2$ lies between $d_2$ and $d_2 + dd_2$ is $\varphi_2(d_2)\, dd_2$, etc.

The probability of the concurrence of these deviations, if they are independent,
is given by

[8]  $\varphi_1(d_1)\, \varphi_2(d_2) \cdots \varphi_n(d_n)\, dd_1\, dd_2 \cdots dd_n$

Therefore the probability that [7] lies between $a_1$ and $a_2$ is the integral of
[8] over all values satisfying

$$a_1 < W_1 d_1 + \dots + W_n d_n < a_2$$

The integration leads to

$$\varphi(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ix\theta - (I_2/2!)\theta^2 + (I_3/3!)(i\theta)^3 + \dots}\, d\theta$$

where the semi-invariants $I_2, I_3, \dots, I_n$ are simple functions of the moments
$\nu_2, \nu_3, \dots, \nu_n$. If $I_2$ is finite and if most of the deviations $d_1 \cdots d_n$ are of the
same order of magnitude, the higher semi-invariants $I_3 \cdots I_n$ will generally
be small in comparison with $I_2$. Hence

$$\varphi(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ix\theta - \frac{1}{2} I_2 \theta^2}\, d\theta = \frac{1}{\sqrt{2\pi I_2}}\, e^{-x^2/2I_2} = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}$$

1.33 Normality of the distribution of means. Various investigations indi- 
cate that the distribution of means of random samples is approximately 
normal even when the samples are drawn from decidedly non-normal populations.
For example, Shewhart (36) obtains the following striking results
from drawing 1000 samples each of 4 items from rectangular and triangular
populations. Normal curves have been fitted to the distributions of means.

[Figure: rectangular and triangular populations and the distributions of means of samples of 4 drawn from them, with fitted normal curves; abscissa X from -2.4 to 2.4]



Carver’s students (11) have considered a population of the following
non-normal character.

       X     Frequency
       3         2
      16         9
      29        43
     406       189
    1710        37

They found the distribution of 1000 means of random samples each of size 25
to be

    Mean      Frequency
    200 -         2
    280 -        54
    360 -       203
    440 -       310
    520 -       254
    600 -       130
    680 -        36
    760 -         9
    840 -         2
                 ----
    Total       1000

Carver concluded from this and other results that if the sample size is 50 or
more and the parent population is at least 10 times as large as the sample, the
shape of the parent population has relatively slight control over the shape
of the curve of means.
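An experiment of this kind is easy to imitate. The sketch below assumes the population is the five-valued frequency table given above and, as a simplifying assumption not stated in the text, draws each sample with replacement; the function names and the random seed are hypothetical.

```python
import random
import statistics

# Population as read from the table above: value -> frequency.
population = {3: 2, 16: 9, 29: 43, 406: 189, 1710: 37}

def population_mean_and_sd(pop):
    """Mean and standard deviation of a population given as value -> frequency."""
    total = sum(pop.values())
    mean = sum(x * f for x, f in pop.items()) / total
    variance = sum(f * (x - mean) ** 2 for x, f in pop.items()) / total
    return mean, math_sqrt(variance)

def math_sqrt(v):
    # Small helper kept local so the block has no further dependencies.
    return v ** 0.5

def sample_means(pop, n, k, seed=0):
    """Means of k random samples of size n, drawn with replacement (an assumption)."""
    rng = random.Random(seed)
    values = list(pop)
    weights = list(pop.values())
    return [statistics.fmean(rng.choices(values, weights=weights, k=n))
            for _ in range(k)]

mean, sd = population_mean_and_sd(population)
means = sample_means(population, n=25, k=1000)
print(mean, sd / 25 ** 0.5)                          # population mean, theoretical standard error
print(statistics.fmean(means), statistics.pstdev(means))  # the same quantities observed
```

The observed mean and standard deviation of the 1000 sample means should fall near the population mean and near $\sigma/\sqrt{25}$, even though the parent population is far from normal.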

Exact and near-exact distributions of means have been found for various 
specific, non-normal populations and, in nearly all cases, the approach of the 
means to normality, even for low n, is evident. Thus, for a rectangular popu- 
lation Rietz and separately Irwin have found that the distribution of means 
rapidly approaches normality. This agrees with Shewhart's experimental 
results which have already been mentioned. The distribution of the means 
of samples drawn from a moderately skewed population known as Pearson's 
Type III has been found separately by Irwin, Church, and C. C. Craig. The 
result is another Type III distribution which rapidly approaches normality 
even for n < 50. 

Using Craig's methods, Ness (29) found similar results for another non- 
normal population, Pearson's Type X. Baker and later Craig found dis- 
tributions from still other non-normal populations; their results support 
the opinions stated above. The extensive literature on sampling from non- 
normal populations has been summarized by Rietz (35) and by Rider (34, b) ; 
their articles contain references to the mathematical work cited above with 
the exception of the unpublished results of Ness. 

The proof of the normality of the curve of means when the samples are
drawn from a normal population and the proof that the variance of the mean
is given by $\sigma^2/n$ will now be given.

1.34 Normality of the mean and the difference of two means. We first
show that if $x$ and $y$ are independent and normally distributed about means
of zero with variances respectively of $\sigma_x^2$ and $\sigma_y^2$, then $x + y$ (or $x - y$) is
normally distributed with zero mean and with variance $\sigma_x^2 + \sigma_y^2$.

A procedure due to Jackson (23) will be used.

Given $\varphi(x, y)$, the frequency function for the joint distribution of $x$ and $y$.
To find $\psi(u)$ where $u = x + y$.

The frequency function of a single variable may be found by integrating
the joint frequency function over all possible values of the other variable.
Thus

$$\psi(u) = \int_{-\infty}^{\infty} \varphi(x, u - x)\, dx$$

is the frequency function of the variable $u = x + y$.

If $x$ and $y$ are normally distributed about zero, i.e., if

$$\varphi_1(x) = c_1 e^{-ax^2}, \qquad \varphi_2(y) = c_2 e^{-by^2}$$

$$a = \frac{1}{2\sigma_x^2}, \qquad b = \frac{1}{2\sigma_y^2}$$

we have, for $x$ and $y$ independent,

$$\varphi(x, y) = c_1 c_2\, e^{-ax^2 - by^2}$$

or

$$\psi(u) = c_1 c_2 \int_{-\infty}^{\infty} e^{-ax^2 - b(u - x)^2}\, dx$$

Write

$$ax^2 + b(u - x)^2 = (a + b)\left(x - \frac{b}{a + b}\, u\right)^2 + \frac{ab}{a + b}\, u^2$$

and

$$x - \frac{b}{a + b}\, u = v, \qquad c = \frac{ab}{a + b}$$

We have

$$\psi(u) = c_1 c_2\, e^{-cu^2} \int_{-\infty}^{\infty} e^{-(a + b)v^2}\, dv$$

The integration yields a constant multiplied by the entire area under the
normal probability distribution (unity).

$$\psi(u) = c_0\, e^{-cu^2} = c_0\, e^{-\frac{ab}{a + b} u^2} = c_0\, e^{-\frac{u^2}{2(\sigma_x^2 + \sigma_y^2)}}$$

which proves the theorem. The result for $x - y$ is the same.

The theorem may easily be generalized to $n$ variables.

If $x_1, \dots, x_n$ are independent and form a random sample from a normal
population of variance $\sigma^2$, then $x_1 + \dots + x_n$ is normally distributed with variance
$n\sigma^2$. Also

$$\frac{x_1 + x_2 + \dots + x_n}{n}$$

will be normally distributed with variance $\sigma^2/n$.

To prove the normality of the sum and of the mean of observations, write

$$x_1 + x_2 + x_3 = (x_1 + x_2) + x_3$$

Now the already proved theorem on the normality of the sum of two variables
will apply to $(x_1 + x_2)$ and $x_3$. The extension to the sum or the mean
of $n$ variables is obvious.
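The theorem on the sum of two normal variables can be checked numerically by carrying out the integration over $x$ directly. This is a rough sketch using the trapezoidal rule; the function names, the integration range, and the particular standard deviations are assumptions chosen for illustration.

```python
import math

def normal_ordinate(x, sigma):
    """Ordinate of a normal curve with zero mean and standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def ordinate_of_sum(u, sigma_x, sigma_y, steps=4000):
    """Integrate phi_1(x) * phi_2(u - x) over x by the trapezoidal rule."""
    low, high = -12.0 * sigma_x, 12.0 * sigma_x  # wide enough to hold nearly all the area
    h = (high - low) / steps
    total = 0.0
    for i in range(steps + 1):
        x = low + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * normal_ordinate(x, sigma_x) * normal_ordinate(u - x, sigma_y)
    return total * h

sigma_x, sigma_y = 1.0, 2.0
u = 1.0
print(ordinate_of_sum(u, sigma_x, sigma_y))
print(normal_ordinate(u, math.sqrt(sigma_x ** 2 + sigma_y ** 2)))
```

The two printed ordinates should agree closely, which is the numerical counterpart of the statement that $x + y$ is normal with variance $\sigma_x^2 + \sigma_y^2$.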

1.35 Variance of the mean and the difference of means. As for the variances
of the sum of $n$ variables and of their mean it may be well to show that
these relationships (and those already found on variances) are independent of
theorems on normality. Let the variances of $X$ and $Y$ be respectively $\sigma_x^2$ and
$\sigma_y^2$ where, for large samples

$$\sigma_x^2 = \frac{\sum f_x (X - \bar{X})^2}{\sum f_x}, \qquad \sigma_y^2 = \frac{\sum f_y (Y - \bar{Y})^2}{\sum f_y}$$

$f_x$ and $f_y$ being the frequencies of $X$ and $Y$ respectively. Frequently we write
the variances in the form $\sum (X - \bar{X})^2 / n$, the frequency $f_x$ being implied though
not explicitly introduced.

For continuous populations with means $\bar{X}'$ and $\bar{Y}'$ set equal to zero, these
definitions are

$$\sigma_x^2 = \int_{-\infty}^{\infty} x^2 \varphi(x)\, dx, \qquad \sigma_y^2 = \int_{-\infty}^{\infty} y^2 \psi(y)\, dy$$

where $x = X - \bar{X}'$ and $y = Y - \bar{Y}'$.

By definition of the variance we have for the variance of $x + y$ where $x$ and
$y$ are independent

$$\sigma_{x+y}^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [(x + y) - 0]^2\, \varphi(x)\psi(y)\, dx\, dy$$

$$= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^2 \varphi(x)\psi(y)\, dx\, dy + \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y^2 \varphi(x)\psi(y)\, dx\, dy + 2\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\, \varphi(x)\psi(y)\, dx\, dy$$

But

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\, \varphi(x)\psi(y)\, dx\, dy = \int_{-\infty}^{\infty} x\varphi(x)\, dx \int_{-\infty}^{\infty} y\psi(y)\, dy = 0$$

Therefore

$$\sigma_{x+y}^2 = \int_{-\infty}^{\infty} x^2 \varphi(x)\, dx \int_{-\infty}^{\infty} \psi(y)\, dy + \int_{-\infty}^{\infty} y^2 \psi(y)\, dy \int_{-\infty}^{\infty} \varphi(x)\, dx = \sigma_x^2 + \sigma_y^2$$

Note that

$$\sigma_{x-y}^2 = \sigma_{x+y}^2$$

These results have already been reached in the special case of normal $x$
and normal $y$. If $x$ and $y$ are from a population of variance $\sigma^2$,

$$\sigma_{x+y}^2 = \sigma^2 + \sigma^2$$

or for $n$ variables

$$\sigma_{\text{sum}}^2 = n\sigma^2$$

For the variance of the mean of $x + y$

$$\sigma_{(x+y)/2}^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left(\frac{x + y}{2} - 0\right)^2 \varphi(x)\psi(y)\, dx\, dy = \frac{\sigma_x^2 + \sigma_y^2}{4}$$

For the case in which $x$ and $y$ are from a population of variance $\sigma^2$

$$\sigma_{(x+y)/2}^2 = \frac{\sigma^2}{2}$$

For the mean of $n$ variables from a population of variance $\sigma^2$

$$\sigma_{\text{mean}}^2 = \frac{\sigma_{x_1}^2 + \dots + \sigma_{x_n}^2}{n^2} = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$

which was to be shown.
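Because the result does not depend on normality, it can be verified exactly on a small non-normal population by listing every possible sample. The sketch below uses a hypothetical four-item population and samples drawn with replacement (so that the draws are independent); the function names are assumptions.

```python
from fractions import Fraction
from itertools import product

def variance(values):
    """Variance about the mean with divisor n, in exact rational arithmetic."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((Fraction(v) - mean) ** 2 for v in values) / n

def variance_of_sample_means(population, n):
    """Variance of the means of all equally likely ordered samples of size n."""
    means = [Fraction(sum(sample), n) for sample in product(population, repeat=n)]
    return variance(means)

population = [1, 2, 6, 6]  # hypothetical, decidedly non-normal
n = 3
print(variance(population) / n)
print(variance_of_sample_means(population, n))
```

The two printed fractions are identical, which is the statement $\sigma_{\text{mean}}^2 = \sigma^2/n$ for this population and sample size.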

1.36 Mean estimate of $\sigma^2$. Given a population of $kn$ observations divided
so that there will be $k$ samples with $n$ observations in each sample, $k$ being very
large. We wish to form an estimate of the population variance $\sigma^2$, the
unknown true value of $\sigma^2$ being

$$\frac{\sum\sum (X - \bar{X}')^2}{kn}$$

Write

$$\frac{\sum (X - \bar{X}')^2}{n} = \frac{\sum (X - \bar{X} + \bar{X} - \bar{X}')^2}{n} = \frac{\sum (X - \bar{X})^2 + \sum (\bar{X} - \bar{X}')^2 + 2\sum (X - \bar{X})(\bar{X} - \bar{X}')}{n}$$

The cross-product term is zero, for $\bar{X} - \bar{X}'$ is a constant and $\sum (X - \bar{X})$
is zero. Writing the variance of the sample as

$$s^2 = \frac{\sum (X - \bar{X})^2}{n}$$

we have

$$\frac{\sum (X - \bar{X}')^2}{n} = s^2 + (\bar{X} - \bar{X}')^2$$

If this expression is summed over $k$ samples and the result divided by $k$,
we obtain

$$\frac{1}{nk} \sum\sum (X - \bar{X}')^2 = \frac{\sum s^2}{k} + \frac{\sum (\bar{X} - \bar{X}')^2}{k}$$

Write $\bar{s}^2$ for the mean sample variance $\sum s^2 / k$. Now

$$\frac{\sum (\bar{X} - \bar{X}')^2}{k}$$

is seen to be the variance of the mean and is therefore equal to $\sigma^2/n$. We have

$$\sigma^2 = \bar{s}^2 + \frac{\sigma^2}{n}$$

i.e., the “mean” estimate of $\sigma^2$ is

$$\sigma^2 = \frac{n}{n - 1}\, \bar{s}^2$$

If $\sigma^2$ must be estimated from the data of a single sample, the estimate is

$$\hat{\sigma}^2 = \frac{n}{n - 1}\, s^2$$

which is equivalent to

$$\frac{\sum (X - \bar{X})^2}{n - 1}$$

This estimate $\hat{\sigma}^2$ is thus shown to be the mean (unbiased) estimate of $\sigma^2$.
It is also “best” in the sense that the variance of $\hat{\sigma}^2$ is a minimum, but this
we do not show. It may be noted that while $\sum (X - \bar{X})^2 / (n - 1)$ is the mean
estimate of $\sigma^2$, $\sqrt{\sum (X - \bar{X})^2 / (n - 1)}$ is not the mean estimate of $\sigma$: this fact is
unimportant relative to the tests of significance used in this chapter.
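The argument can be confirmed exactly by averaging over every possible sample from a small population. The following sketch uses a hypothetical four-item population with samples drawn with replacement; the function names are assumptions introduced for illustration.

```python
from fractions import Fraction
from itertools import product

def sum_of_squares_about_mean(sample):
    """Sum of (X - mean)^2 for one sample, in exact rational arithmetic."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((Fraction(x) - mean) ** 2 for x in sample)

def mean_over_all_samples(population, n, divisor):
    """Average of sum((X - mean)^2) / divisor over all ordered samples of size n."""
    samples = list(product(population, repeat=n))
    total = sum(sum_of_squares_about_mean(s) / divisor for s in samples)
    return total / len(samples)

population = [1, 2, 6, 6]  # hypothetical population
n = 3
sigma_squared = sum_of_squares_about_mean(population) / len(population)

print(sigma_squared)
print(mean_over_all_samples(population, n, divisor=n))      # equals (n - 1)/n times sigma^2
print(mean_over_all_samples(population, n, divisor=n - 1))  # equals sigma^2
```

Dividing by $n$ gives a mean value that falls short of $\sigma^2$ by the factor $(n-1)/n$, while dividing by $n - 1$ gives a mean value equal to $\sigma^2$.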

1.37 Nature of the t test. It was shown that if we have a normal population
of mean $\bar{X}'$ and variance $\sigma^2$ the means of random samples of size $n$ are
normally distributed with mean $\bar{X}'$ and variance $\sigma^2/n$. Thus, if we
reduce any deviation, say $\bar{X} - \bar{X}'$, to standard error units by forming

$$u = \frac{\bar{X} - \bar{X}'}{\sigma/\sqrt{n}}$$

the probability of exceeding $u$ is given by

$$\frac{1}{\sqrt{2\pi}} \int_{u}^{\infty} e^{-u^2/2}\, du$$

Values of this probability integral may be found by subtracting the entries in
Table IV from 0.5.

If $\sigma$ is unknown, the best estimate of $\sigma^2$ is

$$\hat{\sigma}^2 = \frac{n}{n - 1}\, s^2$$

but

$$t = \frac{\bar{X} - \bar{X}'}{\hat{\sigma}/\sqrt{n}} = \frac{\bar{X} - \bar{X}'}{s/\sqrt{n - 1}}$$

is not normally distributed, particularly not for small sample sizes, for when
$n$ is small, the standard deviation $s$ varies considerably from sample to sample.
The variability of $s$ was discussed by earlier writers but to “Student” (39)
belongs the credit both for recognizing the practical importance of the problem
and for an approximately correct solution to the problem of the distribution of $t$.

To find the distribution of $t$, “Student” first found, by approximate methods, the
distribution of $s^2$. He began by finding the first four moments of the distribution
of $s^2$ in terms of the second moment $\sigma^2$ of the normal parent population.

The moments $M_k$ of $s^2$ about the left end of the range ($s^2 = 0$) are found
from simple expansions which yield, for example,

$$M_1 = \frac{\sum (s^2 - 0)}{k} = \frac{n - 1}{n}\, \sigma^2$$

$$M_2 = \frac{\sum (s^2 - 0)^2}{k} = \frac{(n - 1)(n + 1)}{n^2}\, \sigma^4$$

Similar expressions may be found for the third and fourth moments $M_3$
and $M_4$ in terms of $\sigma^2$. These expressions are easily transformed to moments
about the mean of $s^2$, and from these statistics the values of $\sqrt{\beta_1}$ and $\beta_2$ are
computed. The values of $\sqrt{\beta_1}$ and $\beta_2$ indicate that a Pearson Type III curve
will fit the distribution of $s^2$, from which the ordinate of the distribution of $s^2$
is found to be

$$y_{s^2} = C_1\, (s^2)^{(n-3)/2}\, e^{-ns^2/2\sigma^2}$$

and the ordinate of the distribution of $s$,

$$y_s = C_2\, s^{n-2}\, e^{-ns^2/2\sigma^2}$$

$C_1$ and $C_2$ being constants.

“Student” then partially proved that $\bar{X}$ and $s^2$ were independent of each
other. Thus, knowing the mean to be normally distributed and the standard
deviation to be distributed as given above, he found the ordinate of the
distribution of the ratio $t$ to be

$$y_t = \varphi(n) \left(1 + \frac{t^2}{n - 1}\right)^{-n/2}$$

a distribution which is symmetrical about $t = 0$, which is more peaked than the
normal curve but which approaches normality as $n$ becomes large. $\varphi(n)$ is
known. Table V gives values of

$$2\int_{t}^{\infty} y_t\, dt$$


R. A. Fisher (15, a) later gave an exact proof of the distribution of t and of 
the complete independence of the mean and the variance of random samples 
drawn from a normal population. 

1.38 Mean estimate of $\sigma^2$ from two small samples. If the quantity

$$\frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2}$$

is summed over a large number, $k$, of samples and the sum divided by $k$, the
resulting mean or “expected” value will be found to be equal to $\sigma^2$. This is
easily demonstrated if we make use of three elementary properties of $E(X)$,
the expected value of a variable $X$.

$$E(X) = \text{mean } X$$

$$E(cX) = c\, \text{mean } X, \text{ where } c \text{ is a constant}$$

$$E(X + Y) = E(X) + E(Y)$$

We have

$$E\left[\frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2}\right] = E\left[\frac{n_X s_X^2 + n_Y s_Y^2}{n_X + n_Y - 2}\right] = \frac{1}{n_X + n_Y - 2}\left[n_X E(s_X^2) + n_Y E(s_Y^2)\right]$$

But

$$E(s_X^2) = \frac{n_X - 1}{n_X}\, \sigma^2 \quad \text{and} \quad E(s_Y^2) = \frac{n_Y - 1}{n_Y}\, \sigma^2$$

whence

$$E\left[\frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2}\right] = \sigma^2$$

The mean value of the given function is $\sigma^2$. Hence the function is an
unbiased estimate of $\sigma^2$. As in the case of estimating $\sigma^2$ from a single
sample, the mean value given above is the “optimum” or best estimate of $\sigma^2$,
in the sense that it has minimum variance.

1.39 “Student's” t applied to the difference of means. The application
of the distribution of $t$ to problems involving the difference of the means of
two small samples arises from the fact that $t$ is essentially the ratio of a
normally distributed variable $\bar{X}$ to an independent estimate of the standard error
of $\bar{X}$. Write

$$t^2 = \frac{(\bar{X} - \bar{X}')^2}{\hat{\sigma}^2/n}$$

Replacing $\hat{\sigma}$ by $\sqrt{\sum (X - \bar{X})^2 / (n - 1)}$ and dividing numerator and denominator by
$\sigma^2$ we obtain

$$t^2 = \frac{(\bar{X} - \bar{X}')^2 \big/ (\sigma^2/n)}{\sum (X - \bar{X})^2 \big/ \left[(n - 1)\sigma^2\right]}$$

The numerator $\dfrac{\bar{X} - \bar{X}'}{\sigma/\sqrt{n}}$ is normally distributed about zero with unit standard
deviation; the denominator is a function both of $\sum (X - \bar{X})^2$
and of the number of independent observations on which the estimate of $\sigma^2$
is based and its distribution is known. R. A. Fisher (15, a) was the first to
note that any statistic which could be expressed as the ratio of a normally
distributed variable to the square root of such an independently distributed
estimate of the variance of that variable would be distributed as $t$ with degrees
of freedom equal to the number of independent observations from which the
estimate of the variance was made.

This condition is satisfied in a difference of means test. If from normal
populations of means $\bar{X}'$ and $\bar{Y}'$ and variance $\sigma^2$ we draw two random samples
of sizes $n_X$ and $n_Y$ and means $\bar{X}$ and $\bar{Y}$ (optimum estimates of $\bar{X}'$ and $\bar{Y}'$),
we already know that frequencies of $\bar{X}$ and $\bar{Y}$ are distributed normally about
$\bar{X}'$ and $\bar{Y}'$ with respective variances of $\sigma^2/n_X$ and $\sigma^2/n_Y$. We have shown
that $\bar{X} - \bar{Y}$ is normally distributed about $\bar{X}' - \bar{Y}'$ with variance

$$\frac{\sigma^2}{n_X} + \frac{\sigma^2}{n_Y} = \sigma^2\, \frac{n_X + n_Y}{n_X n_Y}$$

The unbiased (and optimum) estimate of $\sigma^2$ is

$$\frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2}$$

which is independent of $\bar{X} - \bar{Y}$. Hence

$$\frac{(\bar{X} - \bar{Y}) - (\bar{X}' - \bar{Y}')}{\sqrt{\dfrac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2} \cdot \dfrac{n_X + n_Y}{n_X n_Y}}}$$

is distributed as $t$ with $n_X + n_Y - 2$ degrees of freedom. In our examples
and in general we are attempting to infer whether or not $\bar{X}' = \bar{Y}'$. Thus if
$\bar{X}' - \bar{Y}' = 0$ lies beyond the 5 per cent level of $t$ we conclude that the optimum
estimates $\bar{X}$ and $\bar{Y}$ are significantly different, i.e., $\bar{X}' \neq \bar{Y}'$.

1.40 Correlation and the t test. Given

    X_1    Y_1
    X_2    Y_2
    ...    ...
    X_n    Y_n

We have tested the difference of means of small samples of $X$ and $Y$ in two
ways. With unpaired variates we computed

$$t = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{2n - 2} \cdot \dfrac{2}{n}}}$$

which is distributed as $t$ with $2n - 2$ degrees of freedom. With paired
variates, we formed

$$X_1 - Y_1 = d_1, \quad X_2 - Y_2 = d_2, \quad \dots, \quad X_n - Y_n = d_n$$

and then computed

$$\bar{X} - \bar{Y} = \bar{d}$$

and

$$t = \frac{\bar{d}}{\sigma_{\bar{d}}}$$

which is distributed as $t$ with $n - 1$ degrees of freedom.

Certain features of the two situations are brought out in the following
example. From two normal populations we draw the two following samples:

        X       Y
        6       7
        4       5
        8       9
        3       4
        9      10
    Mean 6  Mean 7

We find

$$t = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2} \left(\dfrac{1}{n_X} + \dfrac{1}{n_Y}\right)}} = \frac{-1}{\sqrt{\dfrac{26 + 26}{8} \cdot \dfrac{2}{5}}} = -0.620$$

which for eight degrees of freedom yields $P = 0.55$; the difference is not
significant.

If we apply the same method to the following data, $X$ and $Y$ having the
same variates as in the preceding case,

        X       Y
        6       7
        4       4
        8      10
        3       9
        9       5
    Mean 6  Mean 7

we obtain exactly the same result. But the two sets of data differ strikingly.
In the first set, whenever $X$ is greater than $\bar{X}$, the $Y$ paired with that $X$ is
greater than $\bar{Y}$ and whenever $X$ is less than $\bar{X}$, the paired $Y$ is less than $\bar{Y}$.
In the second set of data, on the other hand, there appears to be little
correlation between $X$ and $Y$; for example, when $X$ is greater than $\bar{X}$, $Y$ is in one
case greater than $\bar{Y}$ ($X = 8$, $Y = 10$) whereas in another case $Y$ is less
than $\bar{Y}$ ($X = 9$, $Y = 5$).

Now consider the value of the correlation coefficient $r$ where

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n\, s_X s_Y}$$

for both of the above cases. The value of the numerator of $r$ varies from $-\infty$
to $+\infty$ with the amount and nature (negative and positive) of the correlation
between $X$ and $Y$. The effect of the denominator is to reduce this variation
to the range $-1$ to $+1$ and to eliminate the effect on $r$ of the units in which
$X$ and $Y$ happen to be expressed. If the relationship between $X$ and $Y$ can be
described perfectly by the linear regression function $Y_r = a + bX$ (see
Ch. IV) with positive slope $b$, then $r = +1$. Similarly, if the deviations $X - \bar{X}$ and $Y - \bar{Y}$ are
independent of each other, i.e., if there is no correlation, we have $r = 0$.

These are only a few of the important properties of r, but for our present 
purpose, no other properties will be needed. 

For the first set of data r = +1, whereas in the second set r = 0. 

The first difference of means test clearly does not distinguish between cases
in which $X$ and $Y$ are correlated and those in which no correlation is present.

If the test

$$t = \frac{\bar{d}}{\sigma_{\bar{d}}}$$

is applied, we obtain for the first set of data

        d
       -1
       -1
       -1
       -1
       -1
    Mean -1

$$t = \frac{-1}{\sqrt{\dfrac{\sum (d - \bar{d})^2}{(n - 1)n}}} = \frac{-1}{0}$$

or $t$ is infinite, i.e., the mean difference $\bar{d} = -1$ is certainly significant. For
the second set, in which $r = 0$, we find

$$t = \frac{-1}{\sqrt{\dfrac{\sum (d - \bar{d})^2}{(n - 1)n}}} = \frac{-1}{\sqrt{\dfrac{52}{20}}} = -0.620$$

exactly as before, but now with four instead of eight degrees of freedom.

In the case of positively correlated $X$ and $Y$, elimination of the correlation
by forming differences showed that the mean difference of $-1$ was highly
significant; on the other hand, in the case of uncorrelated $X$ and $Y$ the same
test was less sensitive than the ordinary difference of means test, for we
obtained the same value of $t$ with a loss of four degrees of freedom.

We may note that the ordinary difference of means test may be modified
so that it is equivalent to the second test. In place of

$$\frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{2(n - 1)}$$

as an estimate of $\sigma^2$ in the original test, we use

[9]  $\dfrac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2 - 2\sum (X - \bar{X})(Y - \bar{Y})}{2(n - 1)}$

an estimate of $\sigma^2$ based on $n - 1$ degrees of freedom, for the cross product
reduces the number of independent observations by $n - 1$. For the first set
of data, $\sum (X - \bar{X})(Y - \bar{Y}) = 26$, we would obtain

$$t = \frac{-1}{\sqrt{\dfrac{26 + 26 - 2(26)}{8} \cdot \dfrac{2}{5}}} = \frac{-1}{0}$$

or $t$ is infinite for four degrees of freedom, which corresponds to the result
obtained when the correlation was eliminated by the alternative method of
forming $X_i - Y_i$.

It is not possible to say, in advance of actual trial, which of the two tests

$$t = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{2n - 2} \cdot \dfrac{2}{n}}} \qquad 2n - 2 \text{ degrees of freedom}$$

$$t = \frac{\bar{d}}{\sigma_{\bar{d}}} \qquad n - 1 \text{ degrees of freedom}$$

will be the more sensitive in a paired experiment. If the variables $X$ and $Y$
are positively correlated, either forming differences or using [9] as an estimate
of $\sigma^2$ in the ordinary difference of means test will reduce the variance of (and
hence increase the significance of) the difference $\bar{X} - \bar{Y}$. This gain may be
nullified by the loss of half of the original number of independent observations.
In our examples this loss (from eight to four degrees of freedom) is of no
importance because $\sigma^2$ declined from 26 to 0; this is, of course, an extreme
example. If the variates are unpaired in the original experiment, the second
method is not available.

R. A. Fisher (15, a) has summed up the matter in the following sentence:
“When both methods are available, sometimes the one and sometimes the
other is the more sensitive; if either shows a significant deviation, its
testimony cannot be ignored.”
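The two tests of this section can be compared on the two sets of data with a short sketch. The function names are assumptions; an infinite value of $t$ is returned when every difference is identical, which is how the first data set behaves.

```python
import math

def unpaired_t(xs, ys):
    """Ordinary difference of means test with pooled estimate of variance."""
    nx, ny = len(xs), len(ys)
    mean_x, mean_y = sum(xs) / nx, sum(ys) / ny
    pooled_ss = sum((x - mean_x) ** 2 for x in xs) + sum((y - mean_y) ** 2 for y in ys)
    variance = pooled_ss / (nx + ny - 2)
    return (mean_x - mean_y) / math.sqrt(variance * (1.0 / nx + 1.0 / ny))

def paired_t(xs, ys):
    """Test on the differences d = X - Y, with n - 1 degrees of freedom."""
    d = [x - y for x, y in zip(xs, ys)]
    n = len(d)
    mean_d = sum(d) / n
    ss = sum((v - mean_d) ** 2 for v in d)
    if ss == 0:
        # All differences equal: the standard error is zero and t is infinite.
        return math.copysign(math.inf, mean_d)
    return mean_d / math.sqrt(ss / (n * (n - 1)))

def correlation(xs, ys):
    """Correlation coefficient r of the paired values."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

x = [6, 4, 8, 3, 9]
y_first = [7, 5, 9, 4, 10]   # first set: perfectly correlated with x
y_second = [7, 4, 10, 9, 5]  # second set: uncorrelated with x

for y in (y_first, y_second):
    print(correlation(x, y), unpaired_t(x, y), paired_t(x, y))
```

The unpaired test gives the same value for both sets, while the paired test separates them sharply, which is the point of the comparison in the text.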



CHAPTER II

DIFFERENCES AMONG SEVERAL MEANS 

2.1 Example of a simple experimental arrangement. An industrial
experimenter wishes to compare the effects of five types of grids, A, B,
C, D, and E, on the vacuum of radio tubes. With each type of grid he
uses five tubes. The results, expressed in terms of a measure of vacuum,
follow.

       A       B       C       D       E
     93.6    95.3    94.5    96.8    94.6
     95.3    96.9    97.0    98.2    97.8
     97.0    95.8    97.8    97.2    98.0
     93.7    97.3    97.0    97.2    95.0
     98.0    97.7    98.3    97.9    98.9

2.2 General nature of the analysis. Even if the five types of grids 
are the same, the five column means, i.e., grid means, are not likely to 
be identical. For if from five populations (which we shall assume to be 
normal) which are alike in their means as well as in their variances, five 
random samples each of five observations are drawn, the five sample 
means will differ among themselves by chance. Our problem is to 
determine whether or not the observed variation in column means can 
be so explained. If it cannot, the hypothesis that the five normal popu- 
lations are alike in their means and variances is rejected. Then, if it can 
be shown that the data do not refute those parts of the hypothesis 
covering normality and equality of variances, it will be concluded that 
the means of the five populations differ significantly among themselves,
i.e., the five types of grids differ significantly, in a statistical sense, in
their effects on vacuum.

If the five types of grids are alike in their effects on vacuum, the 
column means will vary about their mean by an amount which is deter- 
minable from the variation of the individual observations in the columns 
about their respective column means. For if the only identifiable
factor (differences among grids) is without effect, both variations among
column means and within columns are allocable to the same host of
unidentifiable factors. Notice that it is not stated that, if grids are
alike in their effects, the variation among column means will be equal
to that within columns for, though both variations are caused by the
same forces, means will practically always vary less than the individual
observations of which they are formed.

If differences among grids really affect vacuum, this calculable rela- 
tionship between variation among column means and variation within 
columns does not exist. For while variation within columns is still 
caused only by unidentifiable causes, variation among column means is 
now attributable to these factors and to real differences among grids. 
This brings us to the nature of the test of significance, later to be called
the F test. First, interpreting the unallocable variation within columns
as the error of the experiment, we can set limits on the amount of variation
that would be expected among the column means, if the same
host of unidentifiable factors affect both. If the observed variation
among means is outside these limits, the hypothesis that the grids are
without effect is rejected.
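The two kinds of variation described above can be computed for the grid data of 2.1. The sketch below is an illustration only: the function names are assumptions, the third value for grid E (98.0) is taken from the Latin Square table of 2.6 because it is illegible in the table of 2.1, and the ratio of mean squares at the end anticipates a test the text has not yet defined.

```python
# Vacuum measurements by grid type (third E value assumed from the Latin Square in 2.6).
grids = {
    "A": [93.6, 95.3, 97.0, 93.7, 98.0],
    "B": [95.3, 96.9, 95.8, 97.3, 97.7],
    "C": [94.5, 97.0, 97.8, 97.0, 98.3],
    "D": [96.8, 98.2, 97.2, 97.2, 97.9],
    "E": [94.6, 97.8, 98.0, 95.0, 98.9],
}

def column_mean(column):
    return sum(column) / len(column)

def variation(columns):
    """Return (among columns, within columns, total) sums of squared deviations."""
    values = [v for column in columns.values() for v in column]
    grand_mean = sum(values) / len(values)
    among = sum(len(c) * (column_mean(c) - grand_mean) ** 2 for c in columns.values())
    within = sum((v - column_mean(c)) ** 2 for c in columns.values() for v in c)
    total = sum((v - grand_mean) ** 2 for v in values)
    return among, within, total

among, within, total = variation(grids)
k = len(grids)                              # number of columns
n = sum(len(c) for c in grids.values())     # number of observations
print(among, within, total)
# Ratio of mean squares: among-column variation relative to the experimental error.
print((among / (k - 1)) / (within / (n - k)))
```

The among-column and within-column sums of squares add up to the total, which is the sense in which the total variation is made up of these two parts.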

2.3 Randomization of factors. In the experimental arrangement 
described in 2.1, all factors which might affect vacuum (other than 
grids) must be allocated at random. For example, assume that several 
sealing machines are used. If all tubes with grid A are sealed by the
first machine and all tubes with grid B are sealed by the second machine,
any conclusions regarding the effect on vacuum of differences among grids
are vitiated, for the observed differences among column means are allocable
to machines and/or to grids. Such vitiation can be precluded by assigning
machines to grids at random. This applies to all influential factors.

The experimental arrangement shown in 2.1 will be called a com- 
pletely randomized arrangement. 

2.4 Magnitude of the error. The error of the completely random- 
ized experiment can be taken to consist of variation in vacuum unex- 
plained by differences among the grids. This variation is made up of 
the effects of differences among sealing machines, personnel, etc., and is 
directly measured by the variation of observations within columns, for 
such variation does not involve differences among grids. Thus, if 
several sealing machines are used on the five tubes containing grid 
type A, the variation of the observations in the first column about the 
mean of the first column is partly the result of differences among 
machines. If the operators used on the machines are of different skills, 
the result will be still greater variation among the observations on 
vacuum within each column, that is, still larger experimental error. 

2.5 Complete control. Experimental error can always be reduced
by holding constant all factors except the one under investigation.
If only one sealing machine is used for all 25 tubes, differences among
sealing machines no longer contribute to the error of the experiment.
And if only one operator is used, the same is true of differences among
operators.

It is often impossible to achieve, simultaneously, complete control 
over all influential factors. For example, in an experiment dealing 
with variation in the quality of yarn, the use of a single loom would 
prolong the experiment over many weeks and the experimental error, 
decreased by the absence of loom differences, would be increased by the 
influence of factors which change with the passage of time, such as 
workroom humidity, operator efficiency, etc. If this source of varia- 
tion is reduced by conducting the experiment in a single week, many 
looms will be required and loom differences reenter. In any case, the 
types of experimental arrangements we shall now describe obviate the 
need for complete control; in addition, they can often be made to yield 
valuable information which cannot be obtained from completely 
controlled experiments. 

2.6 Latin Square. The Latin Square is an arrangement which 
permits at least two factors (other than the one being studied) to vary 
during the experiment, and yet it excludes the principal component of 
their variation from the error of the experiment. Assume that sealing 
machines and operators are two factors which might affect vacuum. If 
five grids are to be compared, the Latin Square arrangement requires 
five machines and five operators. The machines and operators are 
allocated to grids (A, B, C, D, and E) in such a way that the separate 
grids, machines, and operators are associated in the same trio only once. 



                       Machine
Operator      1     2     3     4     5
    1         E     B     D     A     C
    2         C     D     B     E     A
    3         A     C     E     B     D
    4         D     E     A     C     B
    5         B     A     C     D     E
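The defining property of the arrangement (each grid letter occurs exactly once with each machine and once with each operator) can be checked mechanically. The following is a minimal sketch; the names `SQUARE` and `is_latin_square` are illustrative and do not come from the text.

```python
# Layout from the table above: one string per operator (rows 1-5),
# one character per machine (columns 1-5).
SQUARE = [
    "EBDAC",
    "CDBEA",
    "ACEBD",
    "DEACB",
    "BACDE",
]


def is_latin_square(rows):
    """Return True if every symbol occurs exactly once in each row and column."""
    n = len(rows)
    symbols = set(rows[0])
    if len(symbols) != n or any(len(row) != n for row in rows):
        return False
    rows_ok = all(set(row) == symbols for row in rows)
    cols_ok = all({rows[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


print(is_latin_square(SQUARE))
```

A layout that repeats a grid within a machine column fails the check, which is the situation the Latin Square is designed to avoid.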


In order to appreciate the merits of this arrangement, consider the 
completely randomized experiment. In that experiment, two types of 
variation were noted, namely, variation among the grid means and the 
unallocable variation of the individual observations about their 
respective grid means, i.e., variation within grids. There is no other 
source of variation in that experiment; the total variation, i.e., the 
variation of the 25 observations about the grand mean, is made up of 
these two variations. 

Now assume that the earlier data were obtained from the following 
Latin Square arrangement. 



                                 Machine
Operator       1           2           3           4           5
    1        E 98.0      B 95.8      D 97.2      A 97.0      C 97.8
    2        C 98.3      D 97.9      B 97.7      E 98.9      A 98.0
    3        A 93.6      C 94.5      E 94.6      B 95.3      D 96.8
    4        D 97.2      E 95.0      A 93.7      C 97.0      B 97.3
    5        B 96.9      A 95.3      C 97.0      D 98.2      E 97.8


The total variation is the same as before. The grid means are 
unchanged; hence the variation among grids is the same as before. 
If we subtract variation among grids from the total variation, we 
obtain a term, say b, which must be numerically equal to the error 
term in the randomized arrangement, i.e., the term called variation 
within grids. But while the latter represented variation unallocable to 
any specific factor or factors, the b of the Latin Square can be divided 
into three parts, two of which are allocable to specific factors and the 
third of which is unallocable, i.e., unidentifiable. 

In the present example the two new identifiable factors are machines 
and operators. The variation due to differences among machines 
(variation among column means), which contributed heavily to the error 
of the completely randomized experiment, is removable from the error 
of the present experiment; for inasmuch as each grid and each operator 
have been used an equal number of times (once) with each machine, 
the removal of the effect of machine differences cannot vitiate the 
comparison of grids (or of operators). In fact, in a Latin Square it is 
not possible to attribute the mean effect of any one factor to either or 
both of the remaining two factors. The effects of the three factors are 
completely separated; each effect is measurable and removable without 
interference with the others. 

In the completely randomized experiment, it was not possible to 
remove machine effects for, first, there was no stated record as to which 
machine was used with each grid and operator, and second, even if 
there were such a record, it is unlikely, unless deliberately planned, that 
just five machines would be used and that each would be used exactly 
once with each grid and each operator. If these conditions are not 
satisfied, machine effects cannot be removed. For example, assume 
that the completely randomized experiment was conducted as follows 
(the numbers in parentheses refer to the different machines): 


                            Grid
     A           B           C           D           E
 93.6 (1)    95.3 (2)    94.5 (5)    96.8 (3)    94.6 (5)
 95.3 (1)    96.9 (4)    97.0 (1)    98.2 (3)    97.8 (5)
 97.0 (3)    95.8 (4)    97.8 (1)    97.2 (2)    98.0 (5)
 93.7 (3)    97.3 (2)    97.0 (1)    97.2 (4)    95.0 (4)
 98.0 (3)    97.7 (2)    98.3 (5)    97.9 (2)    98.9 (4)


Exactly five machines were used, but the machine effect cannot be 
removed for such a step would in part remove any effect of grids. For 
example, the difference in the means of the first and second machines is 
especially entwined with the differences of grids B and C. 

Returning to the Latin Square, it is apparent that if the effects of 
machine and operator differences are statistically significant, the 
experimental error of the square (error in the sense of unexplainable 
variation) will be less than that of the completely randomized 
arrangement. In the notation of the following table, $b_3$ will be less 
than $b$. 


Comparable Indexes of Variation 

Completely randomized experiment       Latin Square
Variation among grids     (a)          Variation among grids       (a)
Variation within grids    (b)          Variation among machines    b_1 }
                                       Variation among operators   b_2 }  (b)
                                       Unallocable variation       b_3 }
Total variation           (c)          Total variation             (c)


2.7 Size of a Latin Square. It is disadvantageous to use many 
machines and many operators, for the Latin Square excludes only the 
variation among the row and column means from experimental error. 
If many machines are used, the error will be large even if machine means 
are identical. The same is true for operators. This limit on the 
number of machines and operators automatically limits the number of 
types of grids which can be compared in a single square. A 10 by 10 
square may be taken as the maximum. 

A small Latin Square is unreliable for while a Square of any size tends 
to reduce experimental error, the error of our estimate of the true 
value of that error from, say, the nine observations in a 3 by 3 Square 
is high. If as few as three or four grids are to be compared, more than 
one Latin Square must be used. 

2.8 Other considerations. In the Latin Square arrangement, more 
than two influential factors can be admitted and their effects on error 
can be excluded. Assume that along with machine and operator differ- 
ences it is believed that workroom humidity at the time of sealing affects 
vacuum. The various humidities that actually occur may be divided 
into five classes; the first machine is used only at the lowest humidity, 
the second only at another humidity, etc. What was in the original 
Latin Square variation allocable to machines is now variation allocable 
to machines or humidity or both. As we no longer know exactly what 
causes this variation, there has been some loss of information. How- 
ever, if the exclusive purpose of the experiment is to distinguish among 
grids, this loss is of no importance and there may be a gain in the form 
of a further reduction of error by the exclusion of the effects of a 
third factor, humidity. 

Superior arrangements for handling more than two factors, such as 
Graeco-Latin Squares, will not be discussed here. 

The Latin Square is an experimental arrangement in which the 
allocation of machines and operators is subject to a double restriction: 
Each machine must be used once with each grid and each operator; also 
each operator must be used once with each grid and each machine. Many 
Squares satisfy these requirements (for example, the rows in a Latin 
Square may be interchanged). The Squares actually used can be 
selected at random by many card-drawing schemes, which the reader 
can easily arrange for himself. 
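One way to imitate such a card-drawing scheme is to start from a simple cyclic Square and shuffle its rows, its columns, and the assignment of letters. The sketch below is an assumption about how this might be done, not the scheme intended in the text; in particular, this procedure does not reach every possible Latin Square with equal probability. The function name `random_latin_square` is hypothetical.

```python
import random


def random_latin_square(treatments, rng=None):
    """Build a Latin Square by shuffling a cyclic square.

    Rows, columns, and treatment labels are each permuted at random.
    Permuting rows or columns of a Latin Square leaves it a Latin Square.
    """
    rng = rng or random.Random()
    n = len(treatments)
    # Cyclic base square: cell (i, j) holds treatment index (i + j) mod n.
    base = [[(i + j) % n for j in range(n)] for i in range(n)]
    rows = list(range(n))
    cols = list(range(n))
    labels = list(treatments)
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(labels)
    return [[labels[base[r][c]] for c in cols] for r in rows]


if __name__ == "__main__":
    for row in random_latin_square("ABCDE", random.Random(1)):
        print(" ".join(row))
```

Passing a seeded `random.Random` makes the drawn Square reproducible, which is convenient when the allocation has to be recorded.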

2.9 Randomized blocks. The Latin Square arrangement excludes 
the effect of at least two of the factors, say machines and personnel, from 
the unexplained variation to which differences in the third factor, grids, 
are compared. Assume that it was known that one of the two factors 
was without effect, for example, that sealing machines do not differ 
among themselves in their effect on vacuum. Only the effect of 
differences among operators need be excluded, and the following plan, 
known to agronomists as a randomized block arrangement, is appropriate. 

                          Operator
     1           2           3           4           5
 95.8 (B)    98.0 (A)    96.8 (D)    95.0 (E)    95.3 (A)
 97.0 (A)    98.3 (C)    94.6 (E)    97.3 (B)    97.0 (C)
 97.8 (C)    97.9 (D)    93.6 (A)    93.7 (A)    96.9 (B)
 97.2 (D)    97.7 (B)    95.3 (B)    97.0 (C)    97.8 (E)
 98.0 (E)    98.9 (E)    94.5 (C)    97.2 (D)    98.2 (D)


Differences among the means of the blocks, i.e., among the operator 
means, can be removed for each type of grid (the only factor remaining) 
as each type of grid is represented once in each block. Similarly, grid 
differences can be removed, for each operator is represented once with 
each type of grid. Other factors, such as machines and humidity, are 
handled as in 2.8 or are allocated to grids strictly at random in order to 
prevent possible vitiation. 

2.10 Analysis of variance in a completely randomized experiment. 

We shall now consider the method of analysis to be applied to the com- 
pletely randomized experiment. If vacuum is unaffected by grid 
differences, any variation among the five grid means is caused by the 
same unidentifiable factors that cause variation of the individual 
observations “within” each grid. Now conceive of the observations within 
the first column as having been drawn from one normal population, the 
observations within the second column from a second normal popula- 
tion, and so on, and the grid means as having been formed from samples 
whose items were drawn from a sixth normal population. If these 
normal populations are identical, the six estimates of their common 
variance tend to equality. All within-column variation is, however, of 
the same nature, so we shall reduce the number of estimates to one 
pooled within-column estimate and one estimate based on variation 
among means. 

How are these estimates formed? If we pool the within-grid 
variations for two columns (grids), an unbiased estimate of the 
population variance $\sigma^2$ is 

$$\frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{n_1 + n_2 - 2}$$

as was proved in 1.38. The proof given there is easily extended to 
k columns; the unbiased estimate of $\sigma^2$ formed from the pooled 
within-column variation of all k columns is 

$$[1] \qquad \sigma_2^2 = \frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2 + \cdots + \sum (X_k - \bar{X}_k)^2}{n_1 + n_2 + \cdots + n_k - k}$$

In the notation used here $X_1$ is summed over the $n_1$ values of X in the 
first column, $X_2$ is summed over the $n_2$ values of X in the second column, 
etc. In our example, $n_1 = n_2 = \cdots = n_k = 5$ and $k = 5$. 

Now consider the estimate of $\sigma^2$ formed from the variation among 
column means. The k means $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k$ constitute a sample of k 
variates; an unbiased estimate of the variance of the normal 
population of such variates, i.e., of means, is given by 

$$[2] \qquad \frac{\sum (\bar{X}_c - \bar{X})^2}{k - 1}$$

But these variates are means, not individual observations, and it 
would be incorrect to expect [2] to be equal to [1]. [2] is an estimate 
of the variance of a population of means whose variance in terms of the 
variance of the individual observations has already been shown to be 
$\sigma^2/n_c$, where $n_c$ is the number of observations from which each mean is 
formed (in the present example, $n_c = 5$). Hence 

$$[3] \qquad \sigma_1^2 = \frac{\sum n_c (\bar{X}_c - \bar{X})^2}{k - 1}$$

is an unbiased estimate of $\sigma^2$. 

A third unbiased estimate of $\sigma^2$ is obtained from all n observations 
taken together. This estimate is 

$$[4] \qquad \sigma_3^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

It will be noted that the numerator of each estimate contains as many 
terms as there are observations in the experiment (n). This is 
immediately apparent in [1] and [4], while in [3] there are k terms 
$(\bar{X}_c - \bar{X})^2$ and each is weighted by the number of observations $n_c$ in the 
corresponding column, i.e., a total of n terms. It will simplify our 
terminology if we take advantage of this fact and understand the 
summation sign to include n terms. Thus [1] will henceforth be written 

$$\frac{\sum (X - \bar{X}_c)^2}{n - k}$$

and [3] will be replaced by 

$$\frac{\sum (\bar{X}_c - \bar{X})^2}{k - 1}$$



The results to this point can be summarized in the following table: 


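Source of variation   Sum of squares                  Degrees of freedom   Mean square
Among k columns       $\sum(\bar{X}_c - \bar{X})^2$   $k - 1$              $\sigma_1^2 = \sum(\bar{X}_c - \bar{X})^2/(k - 1)$
Within k columns      $\sum(X - \bar{X}_c)^2$         $n - k$              $\sigma_2^2 = \sum(X - \bar{X}_c)^2/(n - k)$
Total                 $\sum(X - \bar{X})^2$           $n - 1$              $\sigma_3^2 = \sum(X - \bar{X})^2/(n - 1)$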


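Let $X_{ij}$ represent the ith variate in the jth column, and let $\bar{X}_j$ 
represent the mean of the jth column. The sums of squares are 

$$\begin{aligned}
\sum(\bar{X}_c - \bar{X})^2 ={}& (\bar{X}_1 - \bar{X})^2 + (\bar{X}_1 - \bar{X})^2 + \cdots + (\bar{X}_1 - \bar{X})^2 && (n_1 \text{ terms}) \\
&+ (\bar{X}_2 - \bar{X})^2 + (\bar{X}_2 - \bar{X})^2 + \cdots + (\bar{X}_2 - \bar{X})^2 && (n_2 \text{ terms}) \\
&+ \cdots \\
&+ (\bar{X}_k - \bar{X})^2 + (\bar{X}_k - \bar{X})^2 + \cdots + (\bar{X}_k - \bar{X})^2 && (n_k \text{ terms})
\end{aligned}$$

$$\begin{aligned}
\sum(X - \bar{X}_c)^2 ={}& (X_{11} - \bar{X}_1)^2 + (X_{21} - \bar{X}_1)^2 + \cdots + (X_{n_1 1} - \bar{X}_1)^2 \\
&+ (X_{12} - \bar{X}_2)^2 + (X_{22} - \bar{X}_2)^2 + \cdots + (X_{n_2 2} - \bar{X}_2)^2 \\
&+ \cdots \\
&+ (X_{1k} - \bar{X}_k)^2 + (X_{2k} - \bar{X}_k)^2 + \cdots + (X_{n_k k} - \bar{X}_k)^2
\end{aligned}$$

$$\begin{aligned}
\sum(X - \bar{X})^2 ={}& (X_{11} - \bar{X})^2 + (X_{21} - \bar{X})^2 + \cdots + (X_{n_1 1} - \bar{X})^2 \\
&+ (X_{12} - \bar{X})^2 + (X_{22} - \bar{X})^2 + \cdots + (X_{n_2 2} - \bar{X})^2 \\
&+ \cdots \\
&+ (X_{1k} - \bar{X})^2 + (X_{2k} - \bar{X})^2 + \cdots + (X_{n_k k} - \bar{X})^2
\end{aligned}$$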
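
The most convenient forms for computing the various sums of squares 
are the following: 

$$\sum(\bar{X}_c - \bar{X})^2 = \sum_c \frac{(\sum X_c)^2}{n_c} - \frac{(\sum X)^2}{n}$$

$$\sum(X - \bar{X}_c)^2 = \sum X^2 - \sum_c \frac{(\sum X_c)^2}{n_c}$$

$$\sum(X - \bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{n}$$

where 

$$\sum_c \frac{(\sum X_c)^2}{n_c} = \frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \cdots + \frac{(\sum X_k)^2}{n_k}$$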

In the notes at the end of this chapter it is shown that the total sum 
of squares is equal to the sum of squares associated with among-column 
variation plus that associated with within-column variation. 

The degrees of freedom are also additive. The number of degrees of 
freedom associated with each estimate will be given by the number of 
variates in each summation less the number of constants (means) about 
which the deviations of the variates are taken. Thus, for total 
variability, we have n values of X in $\sum(X - \bar{X})^2$ less one mean, $\bar{X}$ (the 
mean of all the observations), or $n - 1$ degrees of freedom. For 
variability among columns, there are k values of $\bar{X}_c$ in 
$\sum(\bar{X}_c - \bar{X})^2$ less one mean $\bar{X}$, or $k - 1$ degrees of freedom. For 
within-column variability we have n values of X in $\sum(X - \bar{X}_c)^2$ less 
k values of $\bar{X}_c$, or $n - k$ degrees of freedom. 

2.11 The F test. Now let us review the hypothesis to which this 
analysis is relevant. The hypothesis H states that the k column means 
arise from identical normal populations, i.e., normal populations of the 
same mean $\bar{X}'$ and the same variance $\sigma^2$. If the hypothesis is true, 
the estimates $\sigma_1^2$, $\sigma_2^2$, and $\sigma_3^2$ should be the same, within the allowable 
range of sampling error. If, however, the ratio of, say, $\sigma_1^2$ to $\sigma_2^2$ is 
significantly different from unity, the hypothesis must be rejected, i.e., 
the columns do not come from normal populations of the same mean and the 
same variance. Now if the ratio of $\sigma_1^2$ to $\sigma_2^2$ differs significantly from 
unity while (1) the $a$ and $\sqrt{b_1}$ tests support normality (or normality is 
assumed) and (2) the $L_1$ test supports the hypothesis that all columns 
come from populations of the same variance, it follows that the part of 
H which is untenable is that the columns come from populations alike 
in their means; i.e., the column or grid means differ significantly among 
themselves in their effect on vacuum. 

The value of the ratio F of any two estimates of $\sigma^2$ will tend to unity 
as the number of independent variates on which each estimate is based 
is increased. The distribution of F for random samples drawn from 
normal populations is known, i.e., the probability of obtaining a value 
of F larger than any given value is known. We compute the ratio of an 
estimate associated with a suspected “cause” to the estimate which 
best defines the error of the experiment. If the probability is small, 
say less than 0.05, that this ratio could have occurred by chance in 
sampling from normal populations of identical means and variances, the 
hypothesis that such were the populations is rejected. 

The numerical analysis of the completely randomized experiment 
follows. 


Source of variation     Sum of squares   Degrees of freedom   Mean square
Among grids                  10.25               4               2.56
Within grids (error)         44.28              20               2.21
Total                        54.53              24               ....

$$F = \frac{2.56}{2.21} = 1.16$$


is based on 4 and 20 degrees of freedom. From Table VIII, F would 
have to be as large as 2.87 in order to overthrow the hypothesis. We 
conclude that grid differences are without effect on vacuum. 
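The computing forms of 2.10 can be applied directly to the 25 vacuum readings grouped by grid. The sketch below is illustrative only; the names `GRIDS` and `one_way_anova` are not from the text, and the readings are those of the Latin Square data in 2.6 regrouped by grid letter. It should reproduce the among-grids sum of squares of about 10.25 and an F ratio close to 1.16.

```python
# Vacuum readings from 2.6, regrouped by grid letter.
GRIDS = {
    "A": [97.0, 98.0, 93.6, 93.7, 95.3],
    "B": [95.8, 97.7, 95.3, 97.3, 96.9],
    "C": [97.8, 98.3, 94.5, 97.0, 97.0],
    "D": [97.2, 97.9, 96.8, 97.2, 98.2],
    "E": [98.0, 98.9, 94.6, 95.0, 97.8],
}


def one_way_anova(columns):
    """Return (among SS, within SS, total SS, F) for a list of columns."""
    all_obs = [x for col in columns for x in col]
    n, k = len(all_obs), len(columns)
    correction = sum(all_obs) ** 2 / n            # (sum X)^2 / n
    total_ss = sum(x * x for x in all_obs) - correction
    among_ss = sum(sum(col) ** 2 / len(col) for col in columns) - correction
    within_ss = total_ss - among_ss               # obtained by subtraction
    f_ratio = (among_ss / (k - 1)) / (within_ss / (n - k))
    return among_ss, within_ss, total_ss, f_ratio


among, within, total, f_ratio = one_way_anova(list(GRIDS.values()))
print(f"among={among:.2f} within={within:.2f} total={total:.2f} F={f_ratio:.2f}")
```

The within-grids term is found by subtraction, which relies on the additivity of the sums of squares noted in 2.10.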

2.12 A t test after an F test. Had the entire set of grids differed 
significantly among themselves, the following procedure could have 
been used to determine whether or not the apparent best and second 
best grids differ significantly between themselves. 

The estimate of the variance of a single observation is 2.21. 
The standard deviation is $\sqrt{2.21} = 1.49$. Each grid mean is based on 
five observations; the standard error of a grid mean is accordingly 
$1.49/\sqrt{5} = 0.6664$. The difference of any two grid means has a standard 
error $s_d$ 

$$s_d = s\sqrt{\frac{1}{n} + \frac{1}{n}} = 1.49\sqrt{\frac{2}{5}} = 0.9422$$

The difference of two means, to be significant, should exceed, say, 
$2.086\,s_d = 1.97$, the figure 2.086 being at the 5 per cent level of t for 
20 degrees of freedom. The actual difference of the two best grids is 
only 0.54. In fact no two grids differ in their means by as much 
as 1.97. 

If many grids are studied, two grid means may well differ 
“significantly” even though the F test indicates over-all homogeneity. For 
even in a homogeneous set of means, the difference between say the 
largest and smallest means will likely appear to be “significant.” 
A t test applied to two means after over-all homogeneity has either 
been refuted or not must be used with caution. 
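The arithmetic of this comparison can be sketched as follows. The grid means are computed here from the vacuum data of 2.6 and are not printed in the text; the error mean square 2.21 and the 5 per cent value of t (2.086 for 20 degrees of freedom) are taken from the passage above. The function name is hypothetical, and small differences from the printed figures arise from rounding.

```python
import math
from itertools import combinations

# Grid means computed from the data of 2.6 (five readings per grid).
GRID_MEANS = {"A": 95.52, "B": 96.60, "C": 96.92, "D": 97.46, "E": 96.86}


def least_significant_difference(error_ms, n_per_mean, t_crit):
    """Smallest difference of two means judged significant by the t test."""
    se_diff = math.sqrt(error_ms * 2.0 / n_per_mean)  # s * sqrt(1/n + 1/n)
    return t_crit * se_diff


lsd = least_significant_difference(error_ms=2.21, n_per_mean=5, t_crit=2.086)

# Pairs of grids whose means differ by more than the critical amount.
significant_pairs = [
    (a, b)
    for a, b in combinations(GRID_MEANS, 2)
    if abs(GRID_MEANS[a] - GRID_MEANS[b]) > lsd
]

print(f"critical difference = {lsd:.2f}")
print("significant pairs:", significant_pairs)
```

The caution of the preceding paragraph still applies: comparing many pairs in this way inflates the chance of an apparently significant difference.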

2.13 Analysis of variance in a Latin Square. In the Latin Square 
let X represent an observation, $\bar{X}_g$ a grid mean, $\bar{X}_m$ a machine mean, 
$\bar{X}_o$ an operator mean, and $\bar{X}$ the grand mean. Let there be n 
observations, G grids, M machines, and O operators ($n = G^2$). Four 
independent estimates of the variance of the population can be found, and 
they are listed in the following table. 


Source of variation   Sum of squares                  Degrees of freedom            Mean square
Among grids           $\sum(\bar{X}_g - \bar{X})^2$   $G - 1$                       $\sum(\bar{X}_g - \bar{X})^2/(G - 1)$
Among machines        $\sum(\bar{X}_m - \bar{X})^2$   $M - 1$                       $\sum(\bar{X}_m - \bar{X})^2/(M - 1)$
Among operators       $\sum(\bar{X}_o - \bar{X})^2$   $O - 1$                       $\sum(\bar{X}_o - \bar{X})^2/(O - 1)$
Residual (error)      U (obtained by subtraction)     V (obtained by subtraction)   U/V
Total                 $\sum(X - \bar{X})^2$           $G^2 - 1$

For the data shown in 2.6 we have the table below. 


Source of variation     Sum of squares   Degrees of freedom   Mean square
Among grids                  10.25               4               2.56
Among machines               12.42               4               3.11
Among operators              29.59               4               7.40
Residual (error)              2.27              12               0.19
Total                        54.53              24               ....


Machines and operators account for a very large part of the variation 
which, in the completely randomized experiment, constituted error. 
We have, for grids 

$$F = \frac{2.56}{0.19} = 13.47$$

which for 4 and 12 degrees of freedom is highly significant. The Latin 
Square arrangement shows that differences among the grids have a real 
effect on vacuum, an effect which could not be determined from the 
completely randomized arrangement. 
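The Latin Square breakdown can be reproduced from the data of 2.6 by forming the grid, machine, and operator totals and obtaining the residual by subtraction. This is a minimal sketch with illustrative names (`LETTERS`, `VACUUM`, `latin_square_anova`). Carried out without intermediate rounding, it gives an F for grids of about 13.5, slightly above the 13.47 obtained from the rounded mean squares.

```python
# Rows are operators 1-5, columns are machines 1-5 (data of 2.6).
LETTERS = ["EBDAC", "CDBEA", "ACEBD", "DEACB", "BACDE"]
VACUUM = [
    [98.0, 95.8, 97.2, 97.0, 97.8],
    [98.3, 97.9, 97.7, 98.9, 98.0],
    [93.6, 94.5, 94.6, 95.3, 96.8],
    [97.2, 95.0, 93.7, 97.0, 97.3],
    [96.9, 95.3, 97.0, 98.2, 97.8],
]


def latin_square_anova(letters, values):
    """Sums of squares for treatments (letters), columns, rows, residual."""
    g = len(values)
    n = g * g
    correction = sum(sum(row) for row in values) ** 2 / n

    def ss_from_totals(totals):
        # Each total is a sum of g observations.
        return sum(t * t for t in totals) / g - correction

    row_totals = [sum(row) for row in values]
    col_totals = [sum(values[i][j] for i in range(g)) for j in range(g)]
    trt_totals = {}
    for i in range(g):
        for j in range(g):
            key = letters[i][j]
            trt_totals[key] = trt_totals.get(key, 0.0) + values[i][j]

    ss = {
        "grids": ss_from_totals(trt_totals.values()),
        "machines": ss_from_totals(col_totals),
        "operators": ss_from_totals(row_totals),
        "total": sum(x * x for row in values for x in row) - correction,
    }
    ss["residual"] = ss["total"] - ss["grids"] - ss["machines"] - ss["operators"]
    return ss


ss = latin_square_anova(LETTERS, VACUUM)
g = len(VACUUM)
f_grids = (ss["grids"] / (g - 1)) / (ss["residual"] / ((g - 1) * (g - 2)))
for name, value in ss.items():
    print(f"{name:10s} {value:7.2f}")
print(f"F for grids = {f_grids:.2f}")
```

The residual has (G - 1)(G - 2) = 12 degrees of freedom here, which is the 24 total less 4 each for grids, machines, and operators.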

Is the difference between the means of the two best grids statistically 
significant? The variance of a single observation in the Latin Square 
arrangement is only 0.19, as against 2.21 in the completely randomized 
experiment. The standard deviation is $\sqrt{0.19} = 0.436$. We are 
interested in the standard error of the difference of two means, each 
based on five observations, and this is given by 

$$0.436\sqrt{\frac{1}{5} + \frac{1}{5}}$$

The observed difference in means of the two best grids is 0.54, or 1.96 
standard error units, for 

$$\frac{0.54}{0.436\sqrt{2/5}} = 1.96$$

This does not quite reach the 5 per cent value of t for 12 degrees 
of freedom, which is 2.179. Hence the observed difference between the 
two best grids cannot be said to be statistically significant. The 
accompanying graph illustrates this example; A + B = 0.05. 

2.14 Analysis of variance in randomized blocks. The randomized 
block analysis is given in the table below: 


Source of variation     Sum of squares   Degrees of freedom   Mean square
Among grids                  10.25               4               2.56
Among operators              29.59               4               7.40
Residual (error)             14.69              16               0.92
Total                        54.53              24               ....


The effect of grids is not quite significant. Relative to grid 
differences, this experiment has a large experimental error (resulting 
from inclusion of machine effects in the error term), so that detection 
of differences in grid effects is not possible. 

2.15 Further examples. Tippett (42, b) has described two textile 
experiments, one of which used randomized blocks and the other a Latin 
Square. The data from one of these experiments are shown immediately 
below but for the moment are assumed to come from a completely 
randomized arrangement. The experiment in question was designed to 
determine the effect of differences in roller weightings on the strength of 
yarn. Three roller weightings A, B, and C were used and there were 
four strength tests for each weighting. The quality of the yarn is measured 
by the product of lea strength in pounds and count. 


   A        B        C
 1577     1535     1592
 1690     1640     1652
 1800     1783     1810
 1642     1621     1663


The analysis of variance follows: 


Source of variation     Sum of squares   Degrees of freedom   Mean square
Among weightings            3,000.8              2              1,500.4
Within weightings          83,982.2              9              9,331.4
Total                      86,983.0             11

No test of significance is necessary, for F is less than unity. It must 
be concluded that the effect of differences in weighting is not statistically 
significant. 

Actually this experiment employed a randomized block arrangement, 
the rows shown in the preceding table representing different sets of 
roving bobbins. 



                  Roller weighting
Roving set       A        B        C
    1          1577     1535     1592
    2          1690     1640     1652
    3          1800     1783     1810
    4          1642     1621     1663


In the earlier description row differences were unallocable and 
variation among rows was an important element of the large experimental 
error. In the actual randomized block arrangement rows are identified 
as roving sets and, as each weighting is represented once in each roving 
set, the roving effects can be removed without interfering with 
weighting effects. The result as seen below will be a sharply reduced 
experimental error. 


Source of variation     Sum of squares   Degrees of freedom   Mean square
Among roller weights        3,000.8              2              1,500.4
Among roving sets          82,619.5              3             27,539.8
Residual (error)            1,362.7              6                227.1
Total                      86,983.0             11

We find 

$$F = \frac{1{,}500.4}{227.1} = 6.61$$

which for two and six degrees of freedom is statistically significant 
(P < 0.05). The evidence of this more sensitive experiment indicates 
that differences among roller weights do affect the strength of yarn. 
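The randomized block computation for the yarn data can be sketched in the same way as the earlier analyses: block (roving set) and treatment (weighting) sums of squares are formed from their totals and the residual is found by subtraction. The names `YARN` and `randomized_block_anova` are illustrative, not from the text; small discrepancies from the printed sums of squares are rounding.

```python
# Rows are roving sets 1-4 (blocks); columns are roller weightings A, B, C.
YARN = [
    [1577, 1535, 1592],
    [1690, 1640, 1652],
    [1800, 1783, 1810],
    [1642, 1621, 1663],
]


def randomized_block_anova(table):
    """Sums of squares and F for treatments in a randomized block layout."""
    b, t = len(table), len(table[0])          # blocks, treatments
    n = b * t
    correction = sum(sum(row) for row in table) ** 2 / n
    total = sum(x * x for row in table for x in row) - correction
    blocks = sum(sum(row) ** 2 for row in table) / t - correction
    treatments = (
        sum(sum(row[j] for row in table) ** 2 for j in range(t)) / b - correction
    )
    residual = total - blocks - treatments    # obtained by subtraction
    f_ratio = (treatments / (t - 1)) / (residual / ((b - 1) * (t - 1)))
    return {
        "treatments": treatments,
        "blocks": blocks,
        "residual": residual,
        "total": total,
        "F": f_ratio,
    }


result = randomized_block_anova(YARN)
for name, value in result.items():
    print(f"{name:10s} {value:10.1f}")
```

Dropping the block term from the subtraction returns the completely randomized analysis, in which the roving-set variation is left in the error.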

The second experiment was designed to measure the effect of varia- 
tions in sizing treatments on warp breakage rate. There are four 
treatments A, B, C, and D; there are two factors, loom and time differ- 
ences, whose effects we wish to eliminate. For this problem a Latin 
Square arrangement is especially advantageous for, as already men- 
tioned, neither of the two influential factors could be held constant 
throughout the experiment without augmenting the effect of the other 
on the error. If a single loom is used throughout, many time periods 
(weeks, approximately) will be needed, and variation associated with 
time will increase the error; whereas if the experiment is completed in a 
single week, many looms will be needed and loom differences will mount. 
The Latin Square arrangement effectively eliminates the principal effect 
of both sources of variation. 



                          Loom
Period        1          2          3          4
   1        44 (D)     54 (A)     71 (C)     29 (B)
   2        22 (C)     59 (B)    100 (D)     22 (A)
   3        31 (A)     40 (C)     79 (B)     38 (D)
   4        27 (B)     83 (D)    100 (A)     29 (C)

The analysis of variance follows. 


Source of variation     Sum of squares   Degrees of freedom   Mean square
Among looms                 9,025.0              3               3,008
Among periods                 370.5              3                 124
Among sizes                 1,389.5              3                 463
Residual (error)              254.0              6                  42
Total                      11,039.0             15                ....


We have 

$$F = \frac{463}{42} = 11.02$$

which for three and six degrees of freedom is highly significant. 

An F test applied to periods yields P > 0.05, i.e., the absence of a 
significant effect. Thus, while removal of the effect of loom differences 
clearly improved the precision of the experiment, the same is not true 
of time differences. Were such an experiment to be performed again, 
the question arises as to whether a randomized block arrangement should 
be used, for only one factor (looms) has a significant effect. With the 
present data, a randomized block arrangement (looms as in the Latin 
Square but periods allocated at random) would give the information 
shown in the accompanying table. 


Source of variation     Sum of squares   Degrees of freedom   Mean square
Among looms                 9,025.0              3               3,008
Among sizes                 1,389.5              3                 463
Residual (error)              624.5              9                69.4
Total                      11,039.0             15


$$F = \frac{463}{69.4} = 6.67$$

which yields 0.01 < P < 0.05. 
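The passage from the Latin Square to the randomized block analysis amounts to pooling the period sum of squares and its degrees of freedom into the error term. A short sketch of that arithmetic follows, using the sums of squares printed in the tables above; the dictionary and function names are illustrative. Working from unrounded mean squares gives an F of about 10.9 for the Square rather than the 11.02 obtained from the rounded figures 463 and 42.

```python
# Sums of squares and degrees of freedom from the Latin Square table.
SS = {"looms": 9025.0, "periods": 370.5, "sizes": 1389.5, "residual": 254.0}
DF = {"looms": 3, "periods": 3, "sizes": 3, "residual": 6}


def f_for_sizes(error_ss, error_df):
    """F ratio for sizing treatments against a given error term."""
    return (SS["sizes"] / DF["sizes"]) / (error_ss / error_df)


# Latin Square: periods are removed from the error.
f_square = f_for_sizes(SS["residual"], DF["residual"])

# Randomized blocks: periods are pooled into the error.
pooled_ss = SS["residual"] + SS["periods"]
pooled_df = DF["residual"] + DF["periods"]
f_block = f_for_sizes(pooled_ss, pooled_df)

print(f"Latin Square:      error MS = {SS['residual'] / DF['residual']:.1f}, F = {f_square:.2f}")
print(f"Randomized blocks: error MS = {pooled_ss / pooled_df:.1f}, F = {f_block:.2f}")
```

The pooled error gains three degrees of freedom but absorbs the period variation, which is the trade-off discussed in the paragraph that follows.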

In this instance there is little to choose between the two arrangements, 
and inasmuch as we do not know in advance which factors are important, 
the Latin Square is preferable, at least for the original experiment. The 
advantage of the randomized block arrangement over the Latin Square 
is that the degrees of freedom wasted on an unimportant influence 
(periods) in the latter are allocated to error in the former. Thus the 
error mean square of the block, while it happens to be larger, is more 
reliable, for it is based on nine rather than six independent variates, and the 
value of the ratio F needed to attribute significance to differences among 
sizings is smaller (3.86 as against 4.76). In the present example, this 
increase in the number of independent observations (from 6 to 9) is 
offset by the increase in the mean square (from 42 to 69.4). The effect 
of periods is not significant but it is much larger than the error of the 
Square, and this eliminates whatever advantage there might otherwise 
have been in a randomized block arrangement. 

2.16 Other examples involving analysis of variance. Campbell and 
Lovell (6) give the following data on six independent sets of laboratory 
knock-ratings of a fuel. 


Set 1 

Set 2 

Set3 

Set 4 

Set 5 

Set 6 

70.5 

69.7 

70.5 

71.4 

71.0 

69.5 

71.9 

70.5 

70.7 

70.5 

71.3 

70.6 

71.0 

70.4 

71.0 

71.2 

70.8 

71.5 

71.5 

70.2 

70.5 

70 8 

70.7 

. 

71.1 

71.0 

70.3 

70.1 

69.8 


70.1 

71.0 

71.2 

70.8 

70.5 


69.8 

71.4 

70.1 

71.4 

70.6 

71.0 

70.5 

70.5 

71.0 

71.0 

70.0 


70.0 

70.8 

70.4 

71.0 

69.9 

71.4 

71.1 

70.9 

70.0 

70.6 

70.8 

70.1 


70.5 

71.1 

70.6 

.... 



71.2 

71.0 

70.4 



♦ * * * 


71.4 






71.0 






71.0 




• » » • 


71.2 




.... 


70.4 





Do the laboratory mean ratings differ significantly among themselves? 
Or may the six sets of ratings be combined? 

The principal difference between this and preceding examples of 
completely randomized arrangements lies in the fact that the column 
means X c are not equally important, for they are based on different 
numbers of observations. This fact affects somewhat the validity of 
the following analysis, but we shall assume this effect to be slight. 
The within-sets sum of squares is 

$$\sum(X - \bar{X}_c)^2 = \sum X^2 - \left[ n_1 \bar{X}_1^2 + n_2 \bar{X}_2^2 + \cdots + n_6 \bar{X}_6^2 \right]$$

and the among-sets term is 

$$\sum(\bar{X}_c - \bar{X})^2 = \sum \bar{X}_c^2 - n\bar{X}^2$$


The analysis of variance follows: 


Source of variation     Sum of squares   Degrees of freedom   Mean square
Among sets                    0.63               5               0.13
Within sets (error)          16.86              66               0.26
Total                        17.49              71

Actually, the mean square associated with variability among sets is 
less (although not significantly less) than the error mean square. No 
F test will be applied. The differences among the means of the six sets 
are not significant; the sets are homogeneous and they can be combined. 
The mean will be 70.7 and the standard error of the mean 
$\sqrt{(17.49/71)/72} = 0.059$. 

If the data satisfy the assumptions underlying the method of analysis 
of variance, variation attributable to the action of specific factors and/or 
to their interaction can always be isolated. We shall give several examples. 

Assume that two factors (and their interaction, see 2.17) are suspected 
of being responsible for variation. These three factors are isolated and 
the variation due to each is compared with the variation due only to 
experimental error; we thereby determine whether or not our suspicions 
are justified. 

The data must provide a satisfactory estimate of experimental error 
and in the examples to be given this is true; for each value of the sus- 
pected causes (in the first example these causes are differences among 
lots and differences among rolls) there are several (three) values of the 
variable being studied (porosity). Within each set of three readings on 
porosity there is no change of lot or roll; whatever differences there are 
within each set of three readings are attributed to a large number of 
independent and unknown causes, each of which has a small effect. In 
short, these differences constitute experimental error. 

Rider (34, a) gives the following Western Electric Company data on 
the porosity of condenser paper. Three readings are made on each of 
nine rolls from each lot. 
Reading 

Roll number 



number 

1 

2 

3 

4 

5 

6 

7 

8 

9 



1 

1.5 

1.5 

2.7 

3.0 

3.4 

2.1 



5.1 


I 

2 

1.7 

1.6 

1.9 

2.4 

5.6 

4.1 

2.5 


5.0 



3 

1.6 

1.7 

2.0 

2.6 

5.6 

4.6 

2.8 

1.9 

4.0 

Lot 


1 

1.9 

2.3 

1.8 

1.9 

2.0 


2.4 

1.7 

2.6 

II 

2 

1.5 

2.4 

2.9 

3.5 

1.9 

2.6 

rail 

1.5 


number 


3 

2.1 

2.4 

4.7 

2.8 

2.1 

3.5 

2.1 

2.0 




I 

2.5 

3.2 

1.4 

7.8 

3.2 

1.9 

2.0 

1.1 

m 


III 

2 

2.9 

5.5 

1.5 

5.2 

2.5 

2.2 

2.4 

1.4 

2.5 



3 

3.3 

7.1 

3.4 



3.1 


4.1 

1.9 


The method of analysis of variance does not automatically suggest the 
appropriate breakdown of the data. Thus we might study the variance 
resulting from the differences between the means of rolls $\bar{X}_r$ (including 
all lots) and the grand mean $\bar{X}$, the corresponding sum of squares being 

$$\sum(\bar{X}_r - \bar{X})^2$$

This would have slight value, for the position, say number 1, of a roll 
has no real meaning from lot to lot. The appropriate breakdown of the 
total sum of squares is 

$$\underbrace{\sum(X - \bar{X})^2}_{\text{Total}} = \underbrace{\sum(\bar{X}_l - \bar{X})^2}_{\text{Among lots}} + \underbrace{\sum(\bar{X}_{rl} - \bar{X}_l)^2}_{\text{Among rolls within lots}} + \underbrace{\sum(X - \bar{X}_{rl})^2}_{\text{Among measurements within lot-rolls}}$$

where $\bar{X}_l$ is the mean of a lot including all rolls, $\bar{X}_{rl}$ is the mean of a roll 
for a given lot, etc. Notice that, if the summation signs and squares are 
removed, the symbols “cancel out.” Analysis of variance yields the 
following table: 


Source of variation                    Sum of squares   Degrees of freedom   Mean square
Among lots                                   7.90               2               3.95
Among rolls within lots                     92.87              24               3.87
Among measurements within
  lot-rolls (error)                         42.29              54               0.78
Total                                      143.06              80


As before, we may use as a practical rule the fact that each sum of
squares has degrees of freedom equal to the number of variates summed
less the number of independent relations between the variates. In the
present example there are eighty-one observations, so the variance
estimate involving $\sum(X - \bar{X})^2$ will be based on eighty degrees of freedom,
the mean $\bar{X}$ having been calculated from the observations. The
among-lots sum of squares $\sum(\bar{X}_l - \bar{X})^2$ involves three lot means less one
relationship between them (again the grand mean); hence there are
two degrees of freedom for the estimate based on this sum of squares.
For among-rolls-within-lots, with sum of squares $\sum(\bar{X}_{rl} - \bar{X}_l)^2$, there
are twenty-seven values of $\bar{X}_{rl}$ less three values of $\bar{X}_l$, or 24 degrees of
freedom. In the among-measurements-within-lot-rolls sum of squares
$\sum(X - \bar{X}_{rl})^2$, 81 values of $X$ are summed but there are 27 relations
among the $X$ (27 roll-lot means $\bar{X}_{rl}$), leaving 54 degrees of freedom.
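The counting rule just described can be checked mechanically. The following is a minimal sketch in Python (the variable names are illustrative assumptions, not part of the original text); it reproduces the degrees of freedom and mean squares of the nested porosity table from the three sums of squares quoted there.

```python
# Degrees of freedom for the nested porosity example:
# 3 lots, 9 rolls per lot, 3 readings per roll (81 observations).
lots, rolls_per_lot, readings_per_roll = 3, 9, 3
n_obs = lots * rolls_per_lot * readings_per_roll
n_roll_means = lots * rolls_per_lot

# "number of variates summed less the number of independent relations"
df = {
    "among lots": lots - 1,                        # 3 lot means, 1 grand mean
    "among rolls within lots": n_roll_means - lots,  # 27 roll means, 3 lot means
    "within lot-rolls (error)": n_obs - n_roll_means,  # 81 readings, 27 roll means
}
df_total = n_obs - 1

# Sums of squares as quoted in the table above.
sum_sq = {
    "among lots": 7.90,
    "among rolls within lots": 92.87,
    "within lot-rolls (error)": 42.29,
}
mean_sq = {name: sum_sq[name] / df[name] for name in df}

for name in df:
    print(f"{name:28s} df={df[name]:3d}  mean square={mean_sq[name]:.2f}")
print("total df =", df_total)
```

The component degrees of freedom add up to the total of 80, just as the component sums of squares add up to the total sum of squares.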

Another breakdown of the data would involve a split of the “among
rolls within lots” sum of squares

$$\sum(\bar{X}_{rl} - \bar{X}_l)^2$$

into two terms, first, among rolls,

$$\sum(\bar{X}_r - \bar{X})^2$$

and a term which, as will be demonstrated, shows the joint or
interaction effect of lots and rolls on porosity

$$\sum(\bar{X}_{rl} - \bar{X}_r - \bar{X}_l + \bar{X})^2$$

As in the previous breakdown, there will be 2 degrees of freedom for lots
and 54 degrees of freedom for error. But the 24 degrees of freedom
previously allocated to “among rolls within lots” must now be
allocated to (a) among rolls and (b) interaction of lots and rolls. There are
nine rolls and one restriction in the form of the grand mean, so the
among-rolls estimate of the population variance is based on 8 degrees of
freedom. By subtraction, 16 degrees of freedom are allocable to the
interaction estimate of $\sigma^2$. Again notice the cancellation of symbols
when the exponents and summation signs are removed.

The degrees of freedom allocated to the interaction (of lots and rolls)
estimate will be shown at the end of this chapter to be the product of the
degrees of freedom allocable to the constituent factors (among lots
with 2 and among rolls with 8 degrees of freedom).

Analysis of variance yields the following table. 


Source of variation              Sum of squares   Degrees of freedom   Mean square
Among lots                             7.90                2               3.95
Among rolls                           26.32                8               3.29
Interaction of rolls and lots         66.55               16               4.16
Error                                 42.29               54               0.78
Total                                143.06               80


But differences among rolls for all lots have no practical significance
whereas roll differences within lots are meaningful; hence the earlier
breakdown of variability in quality is to be preferred.

The F test, as used in the analysis of variance, is essentially the ratio
of variability associated with a suspected cause to error. For rolls
(in the earlier breakdown) the appropriate ratio is

$$F_{\text{rolls}} = \frac{3.87}{0.78} = 4.96$$

From Table VIII, for 24 and 54 degrees of freedom, the 1 per cent level
value of $F_{\text{rolls}}$ = 2.16. The actual value of F exceeds this critical value.
Variation in porosity is therefore partly attributable to differences among
rolls within lots and, if possible, these differences should be eliminated.

In judging the variability among lots, the appropriate error sum of
squares is 92.87 plus 42.29, with 24 plus 54 degrees of freedom, for both
factors associated with these quantities clearly contribute to the error
of comparing lot means. We have $F = \frac{3.95}{1.73} = 2.28$ which, from Table
VIII, for 2 and 78 degrees of freedom, is not significant.
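As a hedged illustration of the two ratios just formed, the sketch below recomputes them in Python from the figures of the earlier table. The variable names are assumptions introduced here; the critical values still have to be read from Table VIII.

```python
# Mean squares and sums of squares quoted in the nested porosity table.
ms_lots, ms_rolls, ms_error = 3.95, 3.87, 0.78
ss_rolls_within_lots, df_rolls_within_lots = 92.87, 24
ss_error, df_error = 42.29, 54

# Rolls within lots are judged against the measurement error.
f_rolls = ms_rolls / ms_error

# Lots are judged against an error compounded from rolls-within-lots
# and measurements-within-lot-rolls, since both affect lot means.
pooled_ss = ss_rolls_within_lots + ss_error
pooled_df = df_rolls_within_lots + df_error
pooled_ms = pooled_ss / pooled_df
f_lots = ms_lots / pooled_ms

print(f"F (rolls) = {f_rolls:.2f} on {df_rolls_within_lots} and {df_error} d.f.")
print(f"pooled error mean square = {pooled_ms:.2f} on {pooled_df} d.f.")
print(f"F (lots)  = {f_lots:.2f} on 2 and {pooled_df} d.f.")
```

The choice of denominator is the substantive step: the same lot mean square would give a very different ratio if it were compared with the measurement error alone.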

Rider (34, a) gives the following Western Electric Company data on 
impact strength, in foot-pounds, of specimens of insulating material. 
The specimens were cut lengthwise and crosswise from the sheets as 
indicated. 



                                      Lot number
Cut          Specimen      I       II      III      IV       V
             number

                 1        1.15    1.16    [?]     0.96    0.49
                 2        0.84    0.85    [?]     0.82    0.61
                 3        0.88    1.00    0.64    0.98    0.59
                 4        0.91    1.08    0.72    0.93    0.51
Lengthwise       5        0.86    0.80    0.63    0.81    0.53
specimens        6        0.88    1.01    0.59    0.79    0.72
                 7        0.92    1.14    0.81    0.79    0.67
                 8        0.87    0.87    0.65    0.86    0.47
                 9        0.93    0.97    0.64    0.84    0.44
                10        0.95    1.09    0.75    0.92    0.48

                 1        0.89    0.86    0.52    0.86    0.52
                 2        0.69    1.17    0.52    1.06    0.53
                 3        0.46    1.18    0.80    0.81    0.47
                 4        0.85    1.32    0.64    0.97    0.47
Crosswise        5        0.73    1.03    0.63    0.90    0.57
specimens        6        0.67    0.84    0.58    0.93    0.54
                 7        0.78    0.89    0.65    0.87    0.56
                 8        0.77    0.84    0.60    0.88    0.55
                 9        0.80    1.03    0.71    0.89    0.45
                10        0.79    1.06    0.59    0.82    0.60
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The appropriate breakdown, written symbolically, follows: 

Between types of cut $(\bar{X}_c - \bar{X})$

Among lots $(\bar{X}_l - \bar{X})$

Among specimens within a lot and within a type of cut, that is,
experimental error

$(X - \bar{X}_{lc})$

The total variability is of the form $(X - \bar{X})$; we have

$$(X - \bar{X}) = (\bar{X}_c - \bar{X}) + (\bar{X}_l - \bar{X}) + (X - \bar{X}_{lc}) + R$$

from which R is of the form

$$(\bar{X}_{lc} - \bar{X}_c - \bar{X}_l + \bar{X})$$

a term which will presently be shown to symbolize the interaction or
joint effect on impact strength of cuts and lots.


Source of variation              Sum of squares   Degrees of freedom   Mean square
Between types of cut                 0.0454                1              0.0454
Among lots                           2.7912                4              0.6978
Interaction of cuts and lots         0.1417                4              0.0354
Error                                0.8947               90              0.0099
Total                                3.8730               99


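The cut-lot interaction sum of squares is best calculated by writing the
following totals:

                            Lot number
Cut              I        II       III      IV       V       Total
Lengthwise      9.19     9.97     6.90     8.70     5.51     40.27
Crosswise       7.43    10.22     6.24     8.99     5.26     38.14
Total          16.62    20.19    13.14    17.69    10.77     78.41

We have for this table

$$\sum(\bar{X}_{lc} - \bar{X})^2 = \sum(\bar{X}_c - \bar{X})^2 + \sum(\bar{X}_l - \bar{X})^2 + \sum(\bar{X}_{lc} - \bar{X}_c - \bar{X}_l + \bar{X})^2$$

$$\sum(\bar{X}_{lc} - \bar{X})^2 = 10\left(\frac{9.19}{10}\right)^2 + 10\left(\frac{9.97}{10}\right)^2 + \cdots + 10\left(\frac{5.26}{10}\right)^2 - 100\left(\frac{78.41}{100}\right)^2 = 2.9783$$

$$\sum(\bar{X}_c - \bar{X})^2 = 50\left(\frac{40.27}{50}\right)^2 + 50\left(\frac{38.14}{50}\right)^2 - 100\left(\frac{78.41}{100}\right)^2 = 0.0454$$

$$\sum(\bar{X}_l - \bar{X})^2 = 2.7912 \quad \text{(already calculated)}$$

Interaction sum of squares = 2.9783 − 0.0454 − 2.7912 = 0.1417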
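The subtraction above can be reproduced from the ten cell totals alone. The sketch below is a Python illustration under stated assumptions: the crosswise totals for lots I and II (7.43 and 10.22) are not legible in every copy of the totals table and are taken here as the sums of the specimen readings; all names are illustrative.

```python
# Totals of 10 specimens each, by type of cut and lot (I..V).
cell_totals = {
    "lengthwise": [9.19, 9.97, 6.90, 8.70, 5.51],
    "crosswise": [7.43, 10.22, 6.24, 8.99, 5.26],  # first two summed from the data
}
n_per_cell = 10
n_lots = 5
n_cuts = len(cell_totals)
n_total = n_per_cell * n_lots * n_cuts

grand_total = sum(sum(row) for row in cell_totals.values())
correction = grand_total ** 2 / n_total          # 100 * (78.41 / 100)^2

# Sum of squares among the ten cut-lot cells.
ss_cells = sum(t ** 2 for row in cell_totals.values() for t in row) / n_per_cell - correction

# Between types of cut: row totals, 50 observations each.
ss_cut = sum(sum(row) ** 2 for row in cell_totals.values()) / (n_per_cell * n_lots) - correction

# Among lots: column totals, 20 observations each.
lot_totals = [sum(row[i] for row in cell_totals.values()) for i in range(n_lots)]
ss_lot = sum(t ** 2 for t in lot_totals) / (n_per_cell * n_cuts) - correction

# Interaction is what remains of the cell sum of squares.
ss_interaction = ss_cells - ss_cut - ss_lot

print(f"cells {ss_cells:.4f}  cut {ss_cut:.4f}  lots {ss_lot:.4f}  interaction {ss_interaction:.4f}")
```

Working with totals rather than means avoids rounding each mean before squaring, which is why the text recommends it.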


Notice that the breakdown favored in this example is equivalent to 
the one not favored in the previous example. In the previous example, 
the column headings were roll numbers each of which was without 
meaning when taken over all lots. The number three roll, for instance, 
was the third roll chosen in each lot and if the selection is made at ran- 
dom, there is no reason to expect that the third roll should differ signifi- 
cantly from the others. In the present example, a column comprises 
one lot and lot differences are meaningful. Hence in the present exam- 
ple we are interested in 


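$$\sum(\bar{X}_{\text{column}} - \bar{X})^2$$

Finally

$$F_{\text{cut}} = \frac{0.0454}{0.0099} = 4.59, \qquad F \text{ critical } (.05) = 3.95$$

$$F_{\text{lot}} = \frac{0.6978}{0.0099} = 70.49, \qquad F \text{ critical } (.05) = 2.47$$

$$F_{\text{interaction}} = \frac{0.0354}{0.0099} \approx 3.6, \qquad F \text{ critical } (.05) = 2.47$$

Both cuts and lots, and their joint effect, are significantly responsible
for variable quality.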

As a final example, the following Western Electric Company data are
given by Rider (34, a). They deal with the thickness of coating, in
0.0001 of an inch, on fibre strips sprayed with varnish. Measurements
were taken at each of five different points on each of the three strips
selected from each of five lots.

                            Point number
Lot      Strip number     1     2     3     4     5

              1          10     8    10     9     7
I             2           8     8     8     8    10
              3           8    10    10     6     7

              1          13    12    12    12    13
II            2          10     9    13    11     8
              3          11     8    10    12    12

              1          12    13    14    17    16
III           2          17    10    13    10    14
              3          12    11    13    16    13

              1          14    13    17    11    11
IV            2          11     9    13    11    12
              3          17    13    14    13     8

              1           9    13    17    13    11
V             2           8    11    10    12    11
              3           7    14    14     9     9

When analyzing data which come from an experiment designed and
carried out by someone other than the analyst, one must know what
variation in the data is random variation. The term we call “error”
would consist of variability in thickness among points for a given strip
within a given lot if (a) the points were distributed at random and (b)
the three strips were, say, consistently of three different kinds. If,
however, the strips are randomly chosen from the lot while the different
points refer systematically to certain parts of a strip, i.e., they are not
randomly selected, the error term would better consist of the estimate
of variance from the term among-strips. In the present example the
former is true. The appropriate analysis is, therefore,

(a) variability among lots $(\bar{X}_l - \bar{X})$

(b) variability among strips within lots $(\bar{X}_{sl} - \bar{X}_l)$

(c) variability among points within strips $(X - \bar{X}_{sl})$

Symbolically

$$(X - \bar{X}) = (\bar{X}_l - \bar{X}) + (\bar{X}_{sl} - \bar{X}_l) + (X - \bar{X}_{sl}) + R$$

from which $R = 0$.

The $(X - \bar{X}_{sl})$ term is the experimental error term.

Source of variation                    Sum of squares   Degrees of freedom   Mean square
Among lots                                 207.92                4              51.98
Among strips within lots                    49.20               10               4.92
Among points within strips (error)         277.20               60               4.62
Total                                      534.32               74

We have

$$F_{\text{lots}} = \frac{51.98}{4.66} = 11.15$$

and

$$F_{\text{strips}} = \frac{4.92}{4.62} = 1.06$$

the value 4.66 being the mean square compounded from among strips
within lots and among points within strips (as in the example on page
72). The 5 per cent level values are 2.50 and 1.99. Only the lot
variation is statistically significant. The association of strips with
variation in coating thickness is not statistically significant.
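The whole nested analysis of the coating data can be retraced in a few lines. The following Python sketch is one way to do it, assuming the data are transcribed as lots of three strips of five points each; the function-free layout and the names are illustrative assumptions.

```python
# Coating thickness (0.0001 in.): lot -> three strips -> five points.
coating = {
    "I":   [[10, 8, 10, 9, 7], [8, 8, 8, 8, 10], [8, 10, 10, 6, 7]],
    "II":  [[13, 12, 12, 12, 13], [10, 9, 13, 11, 8], [11, 8, 10, 12, 12]],
    "III": [[12, 13, 14, 17, 16], [17, 10, 13, 10, 14], [12, 11, 13, 16, 13]],
    "IV":  [[14, 13, 17, 11, 11], [11, 9, 13, 11, 12], [17, 13, 14, 13, 8]],
    "V":   [[9, 13, 17, 13, 11], [8, 11, 10, 12, 11], [7, 14, 14, 9, 9]],
}

all_obs = [x for strips in coating.values() for strip in strips for x in strip]
n_total = len(all_obs)
correction = sum(all_obs) ** 2 / n_total

# Uncorrected sums of squares built from totals at each level.
lot_term = sum(sum(map(sum, strips)) ** 2 / sum(map(len, strips)) for strips in coating.values())
strip_term = sum(sum(strip) ** 2 / len(strip) for strips in coating.values() for strip in strips)
raw_term = sum(x * x for x in all_obs)

ss_lots = lot_term - correction          # among lots
ss_strips = strip_term - lot_term        # among strips within lots
ss_error = raw_term - strip_term         # among points within strips
ss_total = raw_term - correction

df_lots, df_strips, df_error = 4, 10, 60
pooled_ms = (ss_strips + ss_error) / (df_strips + df_error)
f_lots = (ss_lots / df_lots) / pooled_ms
f_strips = (ss_strips / df_strips) / (ss_error / df_error)

print(f"lots {ss_lots:.2f}  strips {ss_strips:.2f}  error {ss_error:.2f}  total {ss_total:.2f}")
print(f"F lots = {f_lots:.2f}, F strips = {f_strips:.2f}")
```

Each sum of squares is obtained as a difference between successive levels of totals, which mirrors the symbolic breakdown in which the intermediate means cancel.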

2.17 Interaction. To illustrate the meaning of interaction, con- 
sider the two following examples (from Snedecor, 38). 


                    Column
Row           1       2       3       Mean
 1           1.8     2.0     1.4      1.73
 2           1.6     1.8     1.2      1.53
 3           1.3     1.5     0.9      1.23
Mean         1.57    1.77    1.17     1.50

Analysis of variance yields the following table:

Source of variation                 Sum of squares   Degrees of freedom   Mean square
Rows                                     0.38                2               0.19
Columns                                  0.56                2               0.28
Interaction of rows and columns          0                   4               0
Total                                    0.94                8

where

    Total = Rows + Columns + Interaction

$$\sum(X_{rc}^{*} - \bar{X})^2 = \sum(\bar{X}_r - \bar{X})^2 + \sum(\bar{X}_c - \bar{X})^2 + \sum(X_{rc} - \bar{X}_c - \bar{X}_r + \bar{X})^2$$

The interaction sum of squares

$$\sum(X - \bar{X}_r - \bar{X}_c + \bar{X})^2$$

is given by

$$(1.8 - 1.73 - 1.57 + 1.50)^2 + (1.6 - 1.53 - 1.57 + 1.50)^2 + \cdots$$

each term of which is zero. To appreciate the meaning of zero inter- 
action, notice that in proceeding from the first to the second column 
of the original data, all variates are increased by the same (absolute, not 
percentage) amount (0.2) and from the second to the third column all 
variates are decreased by the same amount (0.6), and similarly for 
rows. Variation among observations from column to column is the same 
regardless of which row is considered, i.e., there is no “ interaction ” 
between columns and rows. 

As a second example, consider 



                    Column
Row           1       2       3       Mean
 1           1.6     2.0     0.8      1.47
 2           1.5     1.0     1.9      1.47
 3           1.3     1.4     1.7      1.47
Mean         1.47    1.47    1.47     1.47

Analysis of variance yields the following table:

Source of variation                 Sum of squares   Degrees of freedom   Mean square
Rows                                     0                   2               0
Columns                                  0                   2               0
Interaction of rows and columns          1.24                4               0.31
Total                                    1.24                8

* $X_{rc} = X$.
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In this case, all variation is attributable to interaction. As one pro- 
ceeds from column to column, the amount and direction of change of 
the variate is completely dependent on the row; thus from the first to
the second column the variate increases by 0.4 in the first row, decreases
by 0.5 in the second, and increases by 0.1 in the third. The algebraic
sum of these changes is zero. Separately, the suspected “causes”
(rows and columns) are responsible for none of the variability; operating
jointly they are responsible for all of the variability.
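The two extreme cases can be verified numerically. The sketch below, in Python, applies the rows, columns, and interaction breakdown to both three-by-three tables; the function name `two_way` is an illustrative assumption, and there is one observation per cell, as in the examples.

```python
def two_way(table):
    """Split the total sum of squares of an r x c table (one value per cell)
    into rows, columns, and interaction."""
    r, c = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]

    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_inter = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(r) for j in range(c)
    )
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    return ss_rows, ss_cols, ss_inter, ss_total


additive = [[1.8, 2.0, 1.4], [1.6, 1.8, 1.2], [1.3, 1.5, 0.9]]
pure_interaction = [[1.6, 2.0, 0.8], [1.5, 1.0, 1.9], [1.3, 1.4, 1.7]]

for name, table in [("first example", additive), ("second example", pure_interaction)]:
    rows, cols, inter, total = two_way(table)
    print(f"{name}: rows {rows:.2f}, columns {cols:.2f}, "
          f"interaction {inter:.2f}, total {total:.2f}")
```

In the first table every column differs from its neighbor by a constant, so the interaction term vanishes; in the second the row and column means are all equal, so nothing is left for the main effects.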

Practical problems yield interactions somewhere between the extremes 
shown in these two examples. Finally, when more than two “ causes ” 
are under investigation (as in the following example) more complex inter- 
action terms will be produced; these may be interpreted analogously to 
the above. 

2.18 Formal analysis of variance. Data can be classified with 
respect to any number of factors (causes). In the few published 
examples dealing with four or more factors, certain effects are a priori 
considered unimportant and the analyst therefore uses a plan of anal- 
ysis appropriate to his data but one which is rarely useful elsewhere. 


                          Pot 1                  Pot 2
                         Cylinder               Cylinder
Run     Journey       3     10     16        3     10     16

           1         47     56    100       52     61     88
           2         55     89     93       49     62     97
I          3         35     57     56       34     60     72
           4         78     67    113       47     93    118
           5         33     40    128       16     29    130

           1         52     66     36       65     80     40
           2         21     61     49      122     97     79
II         3         31     39     25       45     54     72
           4         43     72     52      109    120     80
           5         37     51     67       67     85     63

           1         50     61     60       75    139    130
           2         33     27     49       46     58     63
III        3         24     39     24       15     33     39
           4         18     18     43       22     16     19
           5         28     42     28       27     19     22

           1         24     34     43       46     66     24
           2         24     49     42       40    117    105
IV         3         21     21     51       30     28     34
           4         21     69     48       36     64     53
           5         76     48     42       39     60     78














Experience indicates that students have difficulty following such non- 
systematic procedures. 

Examples involving multifold classification can always first be an- 
alyzed formally (systematically) and combining of terms can be left to 
the end. The following is an example of this procedure. 

Tippett (42, a) gives the data in the previous table, from a paper 
by Gould and Hampton (21) on the mean number of seed (defects) 
per unit area of spectacle glass. Four factors (runs, journeys, cylin- 
ders, and pots) may affect the seed count; accordingly the experiment is 
conducted and the data are classified with respect to these factors. 
Cylinders of glass are made in pots, a journey is equivalent to a day, 
and glass made on consecutive days from the same pots constitutes a run. 
Three cylinders of the eighteen made (numbers 3, 10, and 16 in the 
order of manufacture) were studied. 

These data can be broken down in many ways. Tippett uses the 
following breakdown, which yields mean squares having the greatest 
practical interest. 


Source of variation                                  Degrees of freedom

(a) Within pots
     (1) Between cylinders                                   16
     (2) Between journeys                                    32
     (3) Residual within pots                                64
     (4) Total                                              112

(b) Between pots
     (5) Between runs                                         3
     (6) Residual between pots                                4
     (7) Total                                                7

(c) Between cylinders
     (8) Common to all runs                                   2
     (9) Common to both pots in run less (8)                  6
    (10) Specific to pot                                      8
    (11) Total                                               16

(d) Between journeys
    (12) Common to all runs                                   4
    (13) Common to both pots in run less (12)                12
    (14) Specific to pot                                     16
    (15) Total                                               32

Grand total                                                 119


The original data provide no completely satisfactory index of experi- 
mental error for there is but one measurement for each combination of 
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suspected causes; it is the practice, particularly in unreplicated agri- 
cultural experimentation, to take a complex (uninterpretable) inter- 
action term (interaction of pots, runs, journeys, and cylinders) as 
experimental error. 

If we have two suspected causes (rows and columns), there are

$$C^2_1 + C^2_2 = 3$$

terms in a formal breakdown, $C^2_1$ representing the number of combinations
of two things taken one at a time, etc.



Source of variation            Sum of squares                                         Degrees of freedom
Among columns                  $\sum(\bar{X}_c - \bar{X})^2$                           $c - 1$
Among rows                     $\sum(\bar{X}_r - \bar{X})^2$                           $r - 1$
Interaction (columns × rows)   $\sum(X - \bar{X}_c - \bar{X}_r + \bar{X})^2$           $(c - 1)(r - 1)$
Total                          $\sum(X - \bar{X})^2$                                   $cr - 1$


For a three-cause formal breakdown, there are

$$C^3_1 + C^3_2 + C^3_3 = 7$$

classifications.

[Diagram: a block of data arranged in c columns, r rows, and g groups.]

Source of variation                Sum of squares                                                          Degrees of freedom
Among columns                      $\sum(\bar{X}_c - \bar{X})^2$                                            $c - 1$
Among rows                         $\sum(\bar{X}_r - \bar{X})^2$                                            $r - 1$
Among groups                       $\sum(\bar{X}_g - \bar{X})^2$                                            $g - 1$
First-order interaction
  (columns × rows)                 $\sum(\bar{X}_{cr} - \bar{X}_c - \bar{X}_r + \bar{X})^2$                 $(c - 1)(r - 1)$
First-order interaction
  (columns × groups)               $\sum(\bar{X}_{cg} - \bar{X}_c - \bar{X}_g + \bar{X})^2$                 $(c - 1)(g - 1)$
First-order interaction
  (rows × groups)                  $\sum(\bar{X}_{rg} - \bar{X}_r - \bar{X}_g + \bar{X})^2$                 $(r - 1)(g - 1)$
Second-order interaction
  (columns × rows × groups)        $\sum(X - \bar{X}_{cr} - \bar{X}_{cg} - \bar{X}_{rg} + \bar{X}_c + \bar{X}_r + \bar{X}_g - \bar{X})^2$   $(c - 1)(r - 1)(g - 1)$
Total                              $\sum(X - \bar{X})^2$                                                    $crg - 1$


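For a four-way formal breakdown, we have

$$C^4_1 + C^4_2 + C^4_3 + C^4_4 = 15$$

classifications. The numbers in parentheses show the degrees of freedom
associated with the mean squares for the data of Gould and Hampton.

Main effects
  Runs                    $\sum(\bar{X}_r - \bar{X})^2$          $(r - 1)$    (3)
  Pots                    $\sum(\bar{X}_p - \bar{X})^2$          $(p - 1)$    (1)
  Journeys                $\sum(\bar{X}_j - \bar{X})^2$          $(j - 1)$    (4)
  Cylinders               $\sum(\bar{X}_c - \bar{X})^2$          $(c - 1)$    (2)

First-order interaction
  Runs × pots             $\sum(\bar{X}_{rp} - \bar{X}_r - \bar{X}_p + \bar{X})^2$    $(r - 1)(p - 1)$    (3)
  Runs × journeys         $\sum(\bar{X}_{rj} - \bar{X}_r - \bar{X}_j + \bar{X})^2$    $(r - 1)(j - 1)$    (12)
  Runs × cylinders        $\sum(\bar{X}_{rc} - \bar{X}_r - \bar{X}_c + \bar{X})^2$    $(r - 1)(c - 1)$    (6)
  Pots × journeys         $\sum(\bar{X}_{pj} - \bar{X}_p - \bar{X}_j + \bar{X})^2$    $(p - 1)(j - 1)$    (4)
  Pots × cylinders        $\sum(\bar{X}_{pc} - \bar{X}_p - \bar{X}_c + \bar{X})^2$    $(p - 1)(c - 1)$    (2)
  Journeys × cylinders    $\sum(\bar{X}_{jc} - \bar{X}_j - \bar{X}_c + \bar{X})^2$    $(j - 1)(c - 1)$    (8)

Second-order interaction
  Runs × journeys × cylinders
    $\sum(\bar{X}_{rjc} - \bar{X}_{rj} - \bar{X}_{rc} - \bar{X}_{jc} + \bar{X}_r + \bar{X}_j + \bar{X}_c - \bar{X})^2$    $(r - 1)(j - 1)(c - 1)$    (24)
  Runs × journeys × pots
    $\sum(\bar{X}_{rjp} - \bar{X}_{rj} - \bar{X}_{rp} - \bar{X}_{jp} + \bar{X}_r + \bar{X}_j + \bar{X}_p - \bar{X})^2$    $(r - 1)(j - 1)(p - 1)$    (12)
  Journeys × cylinders × pots
    $\sum(\bar{X}_{jcp} - \bar{X}_{jc} - \bar{X}_{jp} - \bar{X}_{cp} + \bar{X}_j + \bar{X}_c + \bar{X}_p - \bar{X})^2$    $(j - 1)(c - 1)(p - 1)$    (8)
  Runs × cylinders × pots
    $\sum(\bar{X}_{rcp} - \bar{X}_{rc} - \bar{X}_{rp} - \bar{X}_{cp} + \bar{X}_r + \bar{X}_c + \bar{X}_p - \bar{X})^2$    $(r - 1)(c - 1)(p - 1)$    (6)

Third-order interaction
  Runs × journeys × cylinders × pots
    $\sum(X_{rjcp}^{*} - \bar{X}_{rjc} - \bar{X}_{rjp} - \bar{X}_{rcp} - \bar{X}_{jcp} + \bar{X}_{rj} + \bar{X}_{rc} + \bar{X}_{jc} + \bar{X}_{rp} + \bar{X}_{jp} + \bar{X}_{cp} - \bar{X}_r - \bar{X}_j - \bar{X}_c - \bar{X}_p + \bar{X})^2$    $(r - 1)(j - 1)(c - 1)(p - 1)$    (24)

Total    $\sum(X - \bar{X})^2$    $rjcp - 1$    (119)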
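The count of fifteen terms, and the degrees of freedom attached to each, follow from simple enumeration. As a hedged illustration in Python (the dictionary of factor levels is read from the Gould and Hampton layout; the names are assumptions made here):

```python
from itertools import combinations
from math import comb, prod

# Number of levels of each factor in the Gould and Hampton data.
levels = {"runs": 4, "journeys": 5, "cylinders": 3, "pots": 2}

# One term for every non-empty combination of factors; its degrees of
# freedom are the product of (levels - 1) over the factors involved.
terms = {}
for order in range(1, len(levels) + 1):
    for combo in combinations(levels, order):
        terms[" x ".join(combo)] = prod(levels[f] - 1 for f in combo)

n_terms = len(terms)
n_terms_by_formula = sum(comb(len(levels), k) for k in range(1, len(levels) + 1))
total_df = sum(terms.values())
n_observations = prod(levels.values())

for name, df in terms.items():
    print(f"{name:40s} {df:3d}")
print(f"{n_terms} terms, {total_df} degrees of freedom, {n_observations} observations")
```

The degrees of freedom of all fifteen terms sum to one less than the number of observations, which is a convenient check on any formal breakdown.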


It is easier to compute and to appreciate the meanings of these terms if
one sets down portions of the data in the three-cause form. Instead of
writing the means in each case, the totals are used, for although we are
always dealing with means in the form $\sum X/n$ in variance analysis, the
means themselves need seldom be computed; $\sum X$ is sufficient and more
convenient.


* $X_{rjcp} = X$.

Table a

                       Run
Journey        1      2      3      4
   1          404    339    515    237
   2          445    429    276    377
   3          314    266    174    185
   4          516    476    136    291
   5          376    370    166    343

Table b

                       Run
Cylinder       1      2      3      4
   3          446    592    338    357
  10          614    725    452    556
  16          995    563    477    520

Table c

                       Run
Pot            1      2      3      4
   1         1047    702    544    613
   2         1008   1178    723    820

Table d

                         Journey
Cylinder       1      2      3      4      5
   3          411    390    235    374    323
  10          563    560    331    519    374
  16          521    577    373    526    558

Table e

                         Journey
Pot            1      2      3      4      5
   1          629    592    423    642    620
   2          866    935    516    777    635

Table f

                    Cylinder
Pot            3      10      16
   1          751    1006    1149
   2          982    1341    1406
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Table g 



Run 1 

Run 2 

Run 3 

... 

Run 4 

Cylinder 

Cylinder 

Cylinder 

Cylinder 

3 


16 

3 


16 

3 

10 

16 

3 

10 

16 


1 

99 

117 

188 

117 

146 

76 

125 


190 


100 

67 

o 

2 

104 

151 

190 

143 

158 

128 

79 

85 

112 

64 

166 

147 

s 

3 

69 

117 

128 

76 

93 

97 

39 

72 

63 

51 

49 

85 

S3 

4 

125 

■rail 

231 

152 

192 

132 

40 

34 

62 

57 

133 

101 


5 

49 ; 

69 

258 


136 


55 

61 

50 

115 

108 

120 


Table h 



Run 1 

Run 2 

| Journey 

i ... . _ 

Journey 

% 

2 

3 

4 

5 

1 

2 



5 

o 

1 

203 

237 

mm 

258 


154 

131 




p-t 

2 

201 

| 208 

m 

258 


| 185 

298 






Run 3 

Run 4 



Journey 

Journey 



1 

2 


4 

5 

1 


3 

4 

5 

■g 

1 

171 

109 

87 

79 

98 

101 

115 

93 

138 

166 

(S 

2 

344 

167 

87 

57 

68 

136 

262 

92 

153 

177 


Table i 



Pot 1 

Pot 2 

| Cylinder 

Cylinder 

3 

10 

16 

3 

10 

16 


1 

173 

217 

239 

238 

346 

282 

Q 

2 

133 

226 

233 

257 

334 

344 


3 1 

111 

156 

156 

124 

175 

217 

O 

4 

160 

226 

256 

214 

293 

270 


5 

174 

181 

265 

149 

193 

293 
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Table j 



Pot 1 

Pot 2 

Cylinder 

Cylinder 

3 

10 

16 

3 

10 

16 


i 

248 



198 

305 


g 

2 

184 

289 

I 229 ! 


436 

334 

£ 

3 

153 

187 


185 

265 

273 


4 

166 

221 

226 

191 

335 

294 


Each table need not be completely analyzed, for there is a certain
amount of duplication. Thus Table a will yield

    Among runs                       $\sum(\bar{X}_r - \bar{X})^2$
    Among journeys                   $\sum(\bar{X}_j - \bar{X})^2$
    Interaction (runs × journeys)    $\sum(\bar{X}_{rj} - \bar{X}_r - \bar{X}_j + \bar{X})^2$

and Table c will yield

    Among runs                       $\sum(\bar{X}_r - \bar{X})^2$
    Between pots                     $\sum(\bar{X}_p - \bar{X})^2$
    Interaction (runs × pots)        $\sum(\bar{X}_{rp} - \bar{X}_r - \bar{X}_p + \bar{X})^2$

duplicating “among runs”; similarly for other tables. The formal
four-way breakdown of the data is carried out in the table below.


Term                                                Sum of       Degrees of     Mean
number   Source of variation                        squares      freedom        square

  1      Runs                                      13,679.89         3         4,559.96
  2      Journeys                                   9,684.00         4         2,421.00
  3      Cylinders                                  9,132.87         2         4,566.44
  4      Pots                                       5,644.41         1         5,644.41
  5      Runs × journeys                           18,650.07        12         1,554.17
  6      Runs × cylinders                          11,532.73         6         1,922.12
  7      Runs × pots                                4,455.16         3         1,485.05
  8      Journeys × cylinders                       1,992.55         8           249.07
  9      Journeys × pots                            2,727.13         4           681.78
 10      Cylinders × pots                             146.46         2            73.23
 11      Runs × journeys × cylinders                9,104.18        24           379.34
 12      Runs × journeys × pots                     6,855.46        12           571.29
 13      Journeys × cylinders × pots                  917.12         8           114.64
 14      Runs × cylinders × pots                    1,320.47         6           220.08
 15      Runs × journeys × cylinders × pots         6,384.29        24           266.01

         Total                                    102,226.79       119
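Several entries of the fifteen-term table can be recomputed directly from the seed counts. The sketch below is a Python illustration, not the procedure used by the original authors; it assumes the data table is transcribed with each row holding pot 1 (cylinders 3, 10, 16) followed by pot 2 (cylinders 3, 10, 16), and the helper names `cell_ss` and `term_ss` are introduced here for illustration.

```python
from itertools import combinations

# seed[run][journey] = (pot 1: cyl 3, 10, 16; pot 2: cyl 3, 10, 16)
seed = [
    [(47, 56, 100, 52, 61, 88), (55, 89, 93, 49, 62, 97), (35, 57, 56, 34, 60, 72),
     (78, 67, 113, 47, 93, 118), (33, 40, 128, 16, 29, 130)],
    [(52, 66, 36, 65, 80, 40), (21, 61, 49, 122, 97, 79), (31, 39, 25, 45, 54, 72),
     (43, 72, 52, 109, 120, 80), (37, 51, 67, 67, 85, 63)],
    [(50, 61, 60, 75, 139, 130), (33, 27, 49, 46, 58, 63), (24, 39, 24, 15, 33, 39),
     (18, 18, 43, 22, 16, 19), (28, 42, 28, 27, 19, 22)],
    [(24, 34, 43, 46, 66, 24), (24, 49, 42, 40, 117, 105), (21, 21, 51, 30, 28, 34),
     (21, 69, 48, 36, 64, 53), (76, 48, 42, 39, 60, 78)],
]

# Flatten to ((run, journey, cylinder, pot), value) records.
records = [
    ((r, j, c, p), row[3 * p + c])
    for r, run in enumerate(seed)
    for j, row in enumerate(run)
    for p in range(2)
    for c in range(3)
]
FACTORS = {"runs": 0, "journeys": 1, "cylinders": 2, "pots": 3}
correction = sum(x for _, x in records) ** 2 / len(records)


def cell_ss(names):
    """Sum of squares among the totals of the table classified by `names`."""
    totals, counts = {}, {}
    for idx, x in records:
        key = tuple(idx[FACTORS[n]] for n in names)
        totals[key] = totals.get(key, 0) + x
        counts[key] = counts.get(key, 0) + 1
    return sum(t * t / counts[k] for k, t in totals.items()) - correction


def term_ss(names):
    """Main effect or interaction: cell sum of squares less all lower-order terms,
    written as an alternating sum over sub-tables."""
    total = 0.0
    for size in range(1, len(names) + 1):
        sign = (-1) ** (len(names) - size)
        for subset in combinations(names, size):
            total += sign * cell_ss(subset)
    return total


for names in [("runs",), ("journeys",), ("cylinders",), ("pots",),
              ("runs", "journeys"), ("runs", "pots")]:
    print(f"{' x '.join(names):20s} {term_ss(names):12.2f}")
```

The alternating sum in `term_ss` is the totals-based counterpart of the pattern of plus and minus signs in the formal breakdown, so the same function serves for interactions of any order.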
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Practical interest would be expected primarily to center on the signifi- 
cance of the variability between pots, among runs, among journeys, 
and among cylinders. With this in mind how can we best classify the 
above fifteen terms? Note that an interaction term, such as, say term 9, 
can be classified beneath either of two headings — variability among 
journeys or between pots. “ Between pots ” has no interest, for 
the pots were not used in any particular sequence. Hence terms such 
as 9, involving pots and another factor, should be eventually listed under 
the other factor. In a similar way, “ between runs ” means little, 
for the runs are quite independent of each other. Finally, there is no 
question as to how the main effects (terms 1 to 4) are to be classified as 
only one factor is involved in each. 

We obtain the following table:

Source of variation                                Degrees of freedom

Among runs
    Runs (term 1)                                           3
Between pots
    Pots (term 4)                                           1
Among journeys
    Journeys (term 2)                                       4
    Runs × journeys (term 5)                               12
    Journeys × pots (term 9)                                4
    Journeys × pots × runs (term 12)                       12
Among cylinders
    Cylinders (term 3)                                      2
    Runs × cylinders (term 6)                               6
    Cylinders × pots (term 10)                              2
    Runs × cylinders × pots (term 14)                       6

The following terms are not automatically placed by the criterion of
practical interest (run and pot variation of no interest):

                                                   Degrees of freedom

Term 7 (runs × pots)                                        3
Term 8 (journeys × cylinders)                               8
Term 11 (runs × journeys × cylinders)                      24
Term 13 (journeys × cylinders × pots)                       8
Term 15 (runs × journeys × cylinders × pots)               24

Of these the first will be arbitrarily classified as a between-pot term.
The others are all journey × cylinder terms and constitute a residual
interaction, as will be seen from the following addition.

$$\sum(X_{jcpr} - \bar{X}_{jpr} - \bar{X}_{cpr} + \bar{X}_{pr})^2$$

which is the interaction term of $\bar{X}_{jpr}$ and $\bar{X}_{cpr}$, i.e., the joint effect on
seed of journeys and cylinders, the effect differing from one pot-run to
another. This is a secondary or residual effect and is classified as such.


Source of variation               Sum of squares   Degrees of freedom   Mean square

Among journeys
    Overall journey                   9,684.00              4             2,421.00
    Runs × journey                   18,650.07             12             1,554.17
    By pot (9 + 12)                   9,582.59             16               598.91
Among cylinders
    Overall cylinder                  9,132.87              2             4,566.44
    Runs × cylinder                  11,532.73              6             1,922.12
    By pot (10 + 14)                  1,466.93              8               183.37
Between pots
    Pots                              5,644.41              1             5,644.41
    By runs                           4,455.16              3             1,485.05
Among runs                           13,679.89              3             4,559.96
Residual
    (8 + 11 + 13 + 15)               18,398.14             64               287.47
Total                                                     119



The accompanying table is equivalent to the table set down by Tippett. 
Analysis of the mean squares may now be carried out, and the reader is 
referred to Tippett (42, a) for further discussion and final conclusions. 
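The regrouping of the fifteen formal terms into the final table is pure addition of sums of squares and degrees of freedom. A minimal Python sketch of that bookkeeping follows; the term numbers and values are those of the fifteen-term table, while the dictionary names are illustrative assumptions.

```python
# term number -> (sum of squares, degrees of freedom), from the formal breakdown
terms = {
    1: (13679.89, 3), 2: (9684.00, 4), 3: (9132.87, 2), 4: (5644.41, 1),
    5: (18650.07, 12), 6: (11532.73, 6), 7: (4455.16, 3), 8: (1992.55, 8),
    9: (2727.13, 4), 10: (146.46, 2), 11: (9104.18, 24), 12: (6855.46, 12),
    13: (917.12, 8), 14: (1320.47, 6), 15: (6384.29, 24),
}

# Combined lines of the final table and the terms each one absorbs.
groups = {
    "journeys by pot (9 + 12)": [9, 12],
    "cylinders by pot (10 + 14)": [10, 14],
    "residual (8 + 11 + 13 + 15)": [8, 11, 13, 15],
}

combined = {}
for name, members in groups.items():
    ss = round(sum(terms[t][0] for t in members), 2)
    df = sum(terms[t][1] for t in members)
    combined[name] = (ss, df, round(ss / df, 2))
    print(f"{name:30s} SS={ss:10.2f}  df={df:3d}  MS={ss / df:8.2f}")

total_ss = round(sum(ss for ss, _ in terms.values()), 2)
total_df = sum(df for _, df in terms.values())
print(f"total SS={total_ss:.2f}  total df={total_df}")
```

Leaving the combination of terms to the end, as the text suggests, means that a different grouping can be tried without repeating any of the underlying arithmetic.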

2.19 The L tests. It has been noted that a test of the homogeneity
of variances ($L_1$) should precede the F test if the F test is to be
construed as a test of the homogeneity of means. The $L_1$ test was used in
the first chapter; we shall now illustrate this test in greater detail.
In addition a new preliminary test known as the $L_0$ test will be
illustrated. The three tests $L_0$, $L_1$, and F are most informative when used
together, as will be shown presently.

Rider (34, a) gives the following Western Electric Company data on 
the breaking strength in pounds tension of cement briquettes. 


                              Batches
   1     2     3     4     5     6     7     8     9    10

  518   508   554   555   536   544   578   530   590   542
  560   574   598   567   492   502   532   564   554   556
  538   528   579   550   528   548   562   536   530   590
  510   534   538   535   572   562   524   540   572   546
  544   538   544   540   506   534   548   530   525   522


There are 50 observations, divided among 10 samples, each sample
having 5 observations. The notation will be k = 10, n = 5, and N = 50.
The questions which the $L_0$, $L_1$, and F tests can answer are:

(1) Could these 10 samples belong to normal populations having the
same mean and the same variance? ($L_0$ test.)

(2) Could these 10 samples belong to normal populations of the same
variance, no stipulations being made as to the mean? ($L_1$ test.)

(3) Could these 10 samples belong to normal populations whose means
are appreciably the same and whose variances are assumed the same?
(F test.)

The functions devised by Neyman and Pearson (30) are respectively:

$$L_0 = \left(\frac{s_1^2 \cdot s_2^2 \cdots s_k^2}{s_0^2 \cdot s_0^2 \cdots s_0^2}\right)^{1/k} \qquad [5]$$

$$L_1 = \left(\frac{s_1^2 \cdot s_2^2 \cdots s_k^2}{s_a^2 \cdot s_a^2 \cdots s_a^2}\right)^{1/k} \qquad [6]$$

where

$$n s_1^2 = \sum_{1}^{n}(X - \bar{X}_1)^2, \qquad n s_2^2 = \sum_{1}^{n}(X - \bar{X}_2)^2, \ \ldots$$

$$N s_0^2 = \sum_{1}^{nk}(X - \bar{X})^2, \qquad N s_a^2 = n\sum_{1}^{k} s_i^2, \qquad \text{and } N = nk$$

$s_i^2$ are the within-sample variances, $s_0^2$ is the variance based on the
deviation of all N observations about their mean $\bar{X}$, and $s_a^2$ is the mean of the
within-sample variances. $\bar{X}_i$ are the sample means. For theoretical
reasons, only the case of equal sample (batch) sizes

$$n_1 = n_2 = \cdots = n$$

can be considered.

In both the $L_0$ and $L_1$ tests, if the hypothesis that the data do belong
to the specified normal population is true, the value of L tends to unity,
although the occurrence of unity will be a highly unlikely event even if
the hypothesis is true, for L is subject to sampling error. The less the
data support the hypothesis, the nearer for given n will the value of the
corresponding L come to zero.

The distributions of $L_0$ and $L_1$ have been approximated by Neyman
and Pearson (30) and tables have been prepared by Mahalanobis
(26) (Tables IX and X) and by Nayer (28).

In computing $L_0$ and $L_1$, we introduce the geometric mean $s_g^2$ of the
within-sample variances $s_i^2$.

$$s_g^2 = (s_1^2 \cdot s_2^2 \cdots s_k^2)^{1/k}$$

$$\log s_g^2 = \frac{1}{k}(\log s_1^2 + \log s_2^2 + \cdots + \log s_k^2)$$

Then

$$\log L_0 = \log s_g^2 - \log s_0^2$$

$$\log L_1 = \log s_g^2 - \log s_a^2$$

Sample     Within-sample variance $s_i^2$
   1              324.80
   2              459.84
   3              509.44
   4              127.44
   5              754.56
   6              404.80
   7              384.96
   8              158.40
   9              607.36
  10              498.56

$s_0^2 = 528.96$
$s_a^2 = 423.02$
$s_g^2 = 374.75$

from which

$L_0 = 0.708$

$L_1 = 0.886$
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For n = 5 and k ~ 10, the 5 per cent level of L 0 (Table IX) is 0.4857. 
We have $L_0$ = 0.708. The 10 samples are homogeneous with respect
to mean and variance.

As the $L_0$ test indicates that the samples came from the same normal
population (i.e., from normal populations of the same mean and the
same variance) the $L_1$ and F tests must both fail (i.e., show no
significance) for they test the nature of any non-homogeneity disclosed by the
$L_0$ test. From Table X, the 5 per cent level of $L_1$ is 0.6318, indicating
that the $L_1$ hypothesis is upheld; finally, the following analysis of
variance shows that the samples are homogeneous in their means, i.e.,
the null F hypothesis is upheld.


Source of variation     Sum of squares   Degrees of freedom   Mean square
Among batches               5,297.22              9              588.58
Within batches             21,150.80             40              528.77
Total                      26,448.02             49
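The $L_0$ and $L_1$ values for the briquette data can be recomputed from the fifty observations. The following Python sketch is an illustration under the definitions given earlier in this section (variances with divisor n, geometric mean of the within-sample variances); the function name `l_tests` is an assumption introduced here, and the critical values must still be taken from Tables IX and X.

```python
from math import exp, log

# Breaking strength (pounds tension), one list per batch of five briquettes.
batches = [
    [518, 560, 538, 510, 544], [508, 574, 528, 534, 538],
    [554, 598, 579, 538, 544], [555, 567, 550, 535, 540],
    [536, 492, 528, 572, 506], [544, 502, 548, 562, 534],
    [578, 532, 562, 524, 548], [530, 564, 536, 540, 530],
    [590, 554, 530, 572, 525], [542, 556, 590, 546, 522],
]


def variance(values):
    """Variance about the mean with divisor n, as in the text."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def l_tests(samples):
    """Return (L0, L1, s0^2, sa^2, sg^2) for equal-sized samples."""
    within = [variance(s) for s in samples]
    s0_sq = variance([x for s in samples for x in s])    # all N observations
    sa_sq = sum(within) / len(within)                    # mean within-sample variance
    sg_sq = exp(sum(log(v) for v in within) / len(within))  # geometric mean
    return sg_sq / s0_sq, sg_sq / sa_sq, s0_sq, sa_sq, sg_sq


L0, L1, s0_sq, sa_sq, sg_sq = l_tests(batches)
print(f"s0^2 = {s0_sq:.2f}, sa^2 = {sa_sq:.2f}, sg^2 = {sg_sq:.2f}")
print(f"L0 = {L0:.3f}, L1 = {L1:.3f}")
```

Because a geometric mean never exceeds the arithmetic mean, $L_1$ cannot exceed unity, and $L_0$ is smaller still whenever the sample means differ.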


The following statistics have been computed from data and calcula- 
tions given by Dudding and Baker (10). The original data deal with 
the breaking strain of glass tubing. Each sample contains eight obser- 
vations. 


Sample      Mean      Within-sample variance $s_i^2$
   1        1010              4,025
   2        1100             38,013
   3        1020             38,488
   4        1100              6,700
   5        1070             34,300
   6        1180             15,475
   7        1030             12,375
   8        1180             43,100
   9        1040             14,775
  10        1200             11,238
  11         970             38,088
  12        1050             50,825
  13         840             58,675
  14         970
  15        1060             13,413
  16        1130              6,838
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We find 

$$s_0^2 = \frac{4{,}364{,}878}{128} = 34{,}101 \quad \text{(see following analysis of variance)}$$

$s_a^2 = 26{,}182$

$\log s_g^2 = 4.2999$

from which

$L_0 = 0.58$
$L_1 = 0.76$

The 5 per cent levels are, approximately

$L_0 = 0.62$
$L_1 = 0.73$

The $L_0$ hypothesis is not supported, i.e., the samples differ significantly
in their means and/or variances. The result of the $L_1$ test indicates
that the samples do not differ significantly in their variances; hence
they must differ in their means, and an analysis of variance should
support this expectation.


Source of variation          Sum of squares   Degrees of freedom   Mean square
Among samples                   1,013,550             15              67,570
Within samples (error)          3,351,328            112              29,923
Total                           4,364,878            127

From the above table

F = 2.26

which for 15 and 112 degrees of freedom is significant.
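The three statistics for the glass tubing example follow from the summary figures alone. The sketch below, in Python, is a hedged illustration: it takes the sums of squares from the analysis of variance table and $\log s_g^2 = 4.2999$ (common logarithm) as given, and the variable names are assumptions made here.

```python
k, n = 16, 8                      # samples and observations per sample
N = k * n

ss_among, ss_within = 1_013_550, 3_351_328

s0_sq = (ss_among + ss_within) / N     # variance of all N observations
sa_sq = ss_within / N                  # mean of the within-sample variances
sg_sq = 10 ** 4.2999                   # geometric mean, from log s_g^2

L0 = sg_sq / s0_sq
L1 = sg_sq / sa_sq
F = (ss_among / (k - 1)) / (ss_within / (N - k))

print(f"s0^2 = {s0_sq:.0f}, sa^2 = {sa_sq:.0f}, sg^2 = {sg_sq:.0f}")
print(f"L0 = {L0:.2f}, L1 = {L1:.2f}, F = {F:.2f}")
```

Here $L_0$ falls below its 5 per cent level while $L_1$ does not, so the departure from homogeneity is traced to the means, in agreement with the significant F.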

A final example is based on data given by Campbell and Lovell (6) 
on octane ratings of motor fuels. The fuels, which are of known compo- 
sition, are rated blind eight successive times in eight different makes of 
car. The results are shown in the two following tables. 
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63.3 OCTANE NUMBER FUEL 



                            Rating number
Car        1      2      3      4      5      6      7      8

 A       63.8   65.7   65.7   65.7   63.8   67.6   67.6   65.8
 B       68.8   55.0   56.3   61.2   61.2   61.2   61.2   61.2
 C       60.0   60.0   60.0   63.8   56.3   67.6   67.6   72.8
 D       58.7   62.2   62.2   63.8   62.2   62.2   62.2   62.2
 E       60.0   63.8   63.8   63.8   60.0   60.0   63.8   63.8
 F       60.2   58.2   63.8   63.8   63.8   63.8   63.8   63.8
 G       63.8   62.5   64.8   64.8   65.8   65.8   63.8   65.8
 H       63.8   63.8   64.8   63.8   63.8   63.8   65.5   65.5


75.0 OCTANE NUMBER FUEL

                           Rating number
Car       1      2      3      4      5      6      7      8

A       75.0   75.0   73.0   73.0   77.0   76.0   76.0   76.0
B       75.0   75.0   75.0   71.4   75.0   75.0   75.0   77.0
C       75.0   75.0   75.0   75.0   75.0   75.0   74.3   75.0
D       75.0   75.0   77.0   77.0   77.0   77.0   77.0   77.0
E       75.0   75.0   75.0   73.3   75.0   77.0   77.0   74.0
F       75.1   75.1   75.1   75.1   75.1   75.1   75.1   78.0
G       73.0   75.0   75.0   75.0   75.0   75.0   75.0   75.0
H       75.0   77.0   75.0   72.3   77.0   75.0   75.0   77.0


One of the authors' discoveries from these data is that at light knock
intensity (75.0 octane number) the variation in knock rating appears less
(the standard deviation is 1.2) than for the heavier knock intensity
(standard deviation is 3.0). To this they add, “No particular
significance is attached to the variations of the standard deviations from
car to car * * * because of the relatively small amount of data available.”

Let us examine this opinion. We find $L_1$ values of 0.48 and 0.65.
The 5 per cent level of $L_1$ for both fuels is about 0.71. Hence for both
grades of fuel the variation in s from car to car is statistically significant.
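The check can be repeated mechanically. The sketch below is Python, not part of the original text, and it rests on one stated assumption: that $L_1$, for samples of equal size, is the geometric mean of the sample variances divided by their arithmetic mean. The ratings are transcribed from the two tables above; the function name `l1_statistic` is illustrative.

```python
import math
from statistics import pvariance

# Octane ratings transcribed from the tables above (cars A to H, eight ratings each).
FUEL_63 = [
    [63.8, 65.7, 65.7, 65.7, 63.8, 67.6, 67.6, 65.8],
    [68.8, 55.0, 56.3, 61.2, 61.2, 61.2, 61.2, 61.2],
    [60.0, 60.0, 60.0, 63.8, 56.3, 67.6, 67.6, 72.8],
    [58.7, 62.2, 62.2, 63.8, 62.2, 62.2, 62.2, 62.2],
    [60.0, 63.8, 63.8, 63.8, 60.0, 60.0, 63.8, 63.8],
    [60.2, 58.2, 63.8, 63.8, 63.8, 63.8, 63.8, 63.8],
    [63.8, 62.5, 64.8, 64.8, 65.8, 65.8, 63.8, 65.8],
    [63.8, 63.8, 64.8, 63.8, 63.8, 63.8, 65.5, 65.5],
]
FUEL_75 = [
    [75.0, 75.0, 73.0, 73.0, 77.0, 76.0, 76.0, 76.0],
    [75.0, 75.0, 75.0, 71.4, 75.0, 75.0, 75.0, 77.0],
    [75.0, 75.0, 75.0, 75.0, 75.0, 75.0, 74.3, 75.0],
    [75.0, 75.0, 77.0, 77.0, 77.0, 77.0, 77.0, 77.0],
    [75.0, 75.0, 75.0, 73.3, 75.0, 77.0, 77.0, 74.0],
    [75.1, 75.1, 75.1, 75.1, 75.1, 75.1, 75.1, 78.0],
    [73.0, 75.0, 75.0, 75.0, 75.0, 75.0, 75.0, 75.0],
    [75.0, 77.0, 75.0, 72.3, 77.0, 75.0, 75.0, 77.0],
]


def l1_statistic(samples):
    """Geometric mean of the sample variances divided by their arithmetic mean.

    Assumed form of L1 for equal sample sizes; not a definition quoted
    from this passage.
    """
    variances = [pvariance(s) for s in samples]
    geometric = math.exp(sum(math.log(v) for v in variances) / len(variances))
    arithmetic = sum(variances) / len(variances)
    return geometric / arithmetic


l1_63 = l1_statistic(FUEL_63)
l1_75 = l1_statistic(FUEL_75)
print(round(l1_63, 2), round(l1_75, 2))
```

Under that assumption both values fall below the 5 per cent level of about 0.71 quoted above. Small differences from the printed figures may arise from rounding or from the transcription of the tables.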

NOTES 

2.20 Cross-product terms in the analysis of variance. In setting the total 
sum of squares equal to the sum of sums of squares associated with specific 
factors, certain cross-product terms were assumed to be zero. It is easy to 
show that they are zero. Thus for the breakdown shown on page 70 

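$$\sum (X - \bar X)^2 = \sum [(\bar X_l - \bar X) + (\bar X_{rl} - \bar X_l) + (X - \bar X_{rl})]^2$$

$$= \sum (\bar X_l - \bar X)^2 + \sum (\bar X_{rl} - \bar X_l)^2 + \sum (X - \bar X_{rl})^2$$
$$+ 2\sum (\bar X_l - \bar X)(\bar X_{rl} - \bar X_l) + 2\sum (\bar X_l - \bar X)(X - \bar X_{rl})$$
$$+ 2\sum (\bar X_{rl} - \bar X_l)(X - \bar X_{rl})$$

Omitting the common multiplier 2, the cross-product terms may be written

$$(\bar X_l - \bar X)\sum (\bar X_{rl} - \bar X_l) + (\bar X_l - \bar X)\sum (X - \bar X_{rl}) + (\bar X_{rl} - \bar X_l)\sum (X - \bar X_{rl})$$

The first term is an abbreviated statement of

$$(\bar X_{l_1} - \bar X)[(\bar X_{r_1 l_1} - \bar X_{l_1}) + (\bar X_{r_2 l_1} - \bar X_{l_1}) + \cdots]$$
$$+ (\bar X_{l_2} - \bar X)[(\bar X_{r_1 l_2} - \bar X_{l_2}) + (\bar X_{r_2 l_2} - \bar X_{l_2}) + \cdots]$$
$$+ \cdots$$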

The expression outside each bracket is constant for all terms within that 
bracket. The terms within each bracket represent the sum of the deviations of 
variates around their own mean; hence each bracket is zero. 

Similarly, each of the cross-product terms for the breakdown favored on 
page 73 is zero. We have 

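$$\sum (X - \bar X)^2 = \sum [(\bar X_c - \bar X) + (\bar X_l - \bar X) + (\bar X_{lc} - \bar X_l - \bar X_c + \bar X) + (X - \bar X_{lc})]^2$$

$$= \sum (\bar X_c - \bar X)^2 + \sum (\bar X_l - \bar X)^2 + \sum (\bar X_{lc} - \bar X_l - \bar X_c + \bar X)^2 + \sum (X - \bar X_{lc})^2$$
$$+ 2\sum (\bar X_c - \bar X)(\bar X_l - \bar X) + 2\sum (\bar X_c - \bar X)(\bar X_{lc} - \bar X_l - \bar X_c + \bar X)$$
$$+ 2\sum (\bar X_c - \bar X)(X - \bar X_{lc}) + 2\sum (\bar X_l - \bar X)(\bar X_{lc} - \bar X_l - \bar X_c + \bar X)$$
$$+ 2\sum (\bar X_l - \bar X)(X - \bar X_{lc}) + 2\sum (\bar X_{lc} - \bar X_l - \bar X_c + \bar X)(X - \bar X_{lc})$$

By the arguments used in the preceding example the third, fifth, and sixth
cross-product terms are zero. Omitting the factor 2, the remaining
cross-products may be written

$$\sum (\bar X_c - \bar X)(\bar X_{lc} - \bar X_c) - \sum (\bar X_c - \bar X)(\bar X_l - \bar X) + \sum (\bar X_l - \bar X)(\bar X_{lc} - \bar X_l)$$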

As in the previous example the second and fourth terms are zero. The 
other term may be expanded with one term in parentheses outside the summa- 
tion. The term within the summation is the sum of the deviation of variates 
around their means and hence is zero. 

2.21 Distribution of F. The argument underlying the comparison of an
index of variation due to a suspected “cause” with a similar index associated
with unknown (= chance) causes is the following:

If an indefinitely large number of random samples are drawn from a known
population, a sample statistic (such as the mean or variance) will have a
continuous distribution curve which can often be exactly determined by
mathematical procedure rather than be approximated by any amount of
experimental sampling. For example, if an indefinitely large number of
random samples each of n independent observations are drawn from a normal
population of variance $\sigma^2$, and if for each sample the statistic

$$\chi^2 = \frac{\sum (X - \bar X)^2}{\sigma^2}$$

is computed, the distribution of $\chi^2$, with n - 1 degrees of freedom, is

$$C\,(\chi^2)^{(n-3)/2}\, e^{-\chi^2/2}\, d\chi^2$$

where C is a constant.

Similarly, if we have two unbiased estimates $\hat\sigma_1^2$ and
$\hat\sigma_2^2$ of the variance $\sigma^2$ of a normal population where

$$\hat\sigma_1^2 = \frac{\sum (X - \bar X)^2}{f_1}, \qquad \hat\sigma_2^2 = \frac{\sum (Y - \bar Y)^2}{f_2}$$

$f_1$ and $f_2$ being the number of degrees of freedom on which each estimate
is based, the ordinate of the distribution of the ratio

$$F = \frac{\hat\sigma_1^2}{\hat\sigma_2^2}$$

is found to be

$$\psi(f_1, f_2)\, \frac{F^{(f_1 - 2)/2}}{(f_1 F + f_2)^{(f_1 + f_2)/2}}$$

$\psi$ being known. This distribution is found in the following way: The
distribution of $\hat\sigma_1^2$ (for a given value of $\hat\sigma_2^2$) is
known. Similarly for $\hat\sigma_2^2$. The distribution of their ratio F is
found by multiplying the distribution function of $\hat\sigma_1^2$ (for a
given $\hat\sigma_2^2$) by the distribution function of $\hat\sigma_2^2$ and
integrating the product over all values of $\hat\sigma_2^2$ (0 to $\infty$).
Table VIII gives values of F, for various $f_1$ and $f_2$, beyond which lies
5 per cent (and 1 per cent) of the area under the curve of F, i.e., values
of F satisfying the equation

$$0.05 \ \text{(or } 0.01\text{)} = \int_F^{\infty} \psi(f_1, f_2)\, \frac{F^{(f_1 - 2)/2}}{(f_1 F + f_2)^{(f_1 + f_2)/2}}\, dF$$

To summarize: we compute the ratio F from the data. We then determine
the probability in random sampling from a normal population that the
computed ratio would be exceeded. If this probability is small (0.05 or
0.01) we conclude that the mean square in the numerator of F is
significantly greater than the true estimate of $\sigma^2$ furnished by the
denominator; the mean squares could not have arisen from the same normal
population, and variation associated with that cause is statistically
significant.
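The tail area described above can be approximated numerically without tables. The following Python sketch is an illustration only, not the method used to construct Table VIII. It relies on the standard substitution x = f2/(f2 + f1 F), under which the tail area of F becomes an incomplete beta integral, and evaluates that integral by Simpson's rule. The function name `f_upper_tail` and the step count are assumptions of the sketch, and it presumes f2 greater than 2.

```python
import math


def f_upper_tail(f_value, f1, f2, steps=20000):
    """Approximate the probability that the variance ratio exceeds f_value.

    With x = f2 / (f2 + f1 * F), x follows a beta distribution with
    parameters f2/2 and f1/2, so P(F > f_value) is the area of that beta
    density between 0 and x0. Assumes f2 > 2, so the integrand is 0 at x = 0.
    """
    a, b = f2 / 2.0, f1 / 2.0
    x0 = f2 / (f2 + f1 * f_value)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)

    def density(x):
        if x <= 0.0:
            return 0.0
        return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

    # Composite Simpson's rule on [0, x0]; steps must be even.
    h = x0 / steps
    total = density(0.0) + density(x0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3.0


print(round(f_upper_tail(5.12, 1, 9), 3))
print(round(f_upper_tail(10.56, 1, 9), 3))
```

The two calls use the values of F for 1 and 9 degrees of freedom that are quoted later in the chapter as the 5 per cent and 1 per cent points, so the printed areas should lie close to 0.05 and 0.01.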

2.22 Estimating $\sigma^2$. We now outline a proof that the mean squares
given in the last column of each analysis of variance table are unbiased
estimates of the population variance $\sigma^2$.

The method has been used by Irwin (22). Let N observations $X_{rc}$ be
divided among R rows and C columns, RC = N.

                        Column
               1         2        ...       C

         1    X_11      X_12      ...      X_1C
         2    X_21      X_22      ...      X_2C
Row      .
         .
         R    X_R1      X_R2      ...      X_RC

Source of      Sum of squares                                   Degrees of       Mean square
variation                                                       freedom

Rows           $\sum (\bar X_r - \bar X)^2$                     R - 1            $\sum (\bar X_r - \bar X)^2 / (R - 1)$
Columns        $\sum (\bar X_c - \bar X)^2$                     C - 1            $\sum (\bar X_c - \bar X)^2 / (C - 1)$
Interaction    $\sum (X - \bar X_r - \bar X_c + \bar X)^2$      (R - 1)(C - 1)   $\sum (X - \bar X_r - \bar X_c + \bar X)^2 / (R - 1)(C - 1)$

Total          $\sum (X - \bar X)^2$                            RC - 1           $\sum (X - \bar X)^2 / (RC - 1)$

It has already been shown that

$$\frac{\sum (X - \bar X)^2}{RC - 1}$$

is an unbiased estimate of $\sigma^2$, based on RC - 1 degrees of freedom.
Now for the variance due to rows; write the expected value (mean value over
all samples) of $(\bar X_r - \bar X)^2$ as follows:

[7] $$E[(\bar X_r - \bar X)^2] = E[\{(\bar X_r - X') - (\bar X - X')\}^2]$$

Expanding the right-hand side and writing down the expected values of the
resulting three functions, for example,

$$E[\sum (\bar X - X')^2] = \sigma^2$$

we find

$$\sum [E(\bar X_r - \bar X)^2] = \sigma^2 (R - 1)$$

or

$$\frac{\sum (\bar X_r - \bar X)^2}{R - 1}$$

is an unbiased estimate of $\sigma^2$ based on R - 1 degrees of freedom.

X is normally distributed, hence the mean $\bar X_r$ is also normally
distributed; the distribution of

$$\frac{\sum (\bar X_r - \bar X)^2}{R - 1}$$

is essentially that of $s^2$.

Similarly,

$$\frac{\sum (\bar X_c - \bar X)^2}{C - 1}$$

is an unbiased estimate of $\sigma^2$ based on C - 1 degrees of freedom.

In the case of the sum of squares

$$\sum (X - \bar X_r - \bar X_c + \bar X)^2$$

we have

$$E(X - \bar X_r - \bar X_c + \bar X)^2 = E[(X - X') - (\bar X_r - X') - (\bar X_c - X') + (\bar X - X')]^2$$

After expansion and use of [7], we find

$$\sum [E(X - \bar X_r - \bar X_c + \bar X)^2] = (R - 1)(C - 1)\sigma^2$$

or

$$\frac{\sum (X - \bar X_r - \bar X_c + \bar X)^2}{(R - 1)(C - 1)}$$

is an unbiased estimate of $\sigma^2$ based on (R - 1)(C - 1) degrees of freedom.
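The unbiasedness outlined in this note can be illustrated, though not proved, by repeated sampling. The Python sketch below is an illustration only: the table size (4 rows, 5 columns), the population mean 10, the standard deviation 2, the number of trials, and the random seed are all assumptions chosen for the demonstration. Each mean square, averaged over many tables drawn from a single normal population, should settle near the population variance (here 4).

```python
import random


def mean_squares(table):
    """Row, column and interaction mean squares for an R x C table, one value per cell."""
    rows, cols = len(table), len(table[0])
    grand = sum(map(sum, table)) / (rows * cols)
    row_means = [sum(row) / cols for row in table]
    col_means = [sum(table[r][c] for r in range(rows)) / rows for c in range(cols)]
    # Sums extend over all R*C observations, so each row mean is counted C times
    # and each column mean R times.
    ss_rows = cols * sum((m - grand) ** 2 for m in row_means)
    ss_cols = rows * sum((m - grand) ** 2 for m in col_means)
    ss_inter = sum(
        (table[r][c] - row_means[r] - col_means[c] + grand) ** 2
        for r in range(rows)
        for c in range(cols)
    )
    return (
        ss_rows / (rows - 1),
        ss_cols / (cols - 1),
        ss_inter / ((rows - 1) * (cols - 1)),
    )


def average_mean_squares(rows=4, cols=5, sigma=2.0, trials=4000, seed=1):
    """Average the three mean squares over many tables from one normal population."""
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(trials):
        table = [[rng.gauss(10.0, sigma) for _ in range(cols)] for _ in range(rows)]
        for i, ms in enumerate(mean_squares(table)):
            totals[i] += ms
    return [t / trials for t in totals]


averages = average_mean_squares()
print([round(a, 2) for a in averages])
```

Because no row or column effect is built into the simulated tables, all three averages estimate the same quantity, which is the situation the note describes.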



CHAPTER III 

RELATIONSHIP AMONG VARIABLES 


3.1 Introduction. In several sciences, for example, physics,
relationships among variables are often stated in exact functional form.
Thus the relationship between time and distance for an object falling in a
vacuum is written simply as $s = \tfrac{1}{2} g t^2$, and it is implied that
s is exactly determinable from t.

Without questioning the validity of this practice in physics, it is
apparent that it is not valid in industrial research. Both the nature of
industrial experimenting and the impracticability of duplicating the
relatively controlled conditions of physics laboratories bring about this
result. The hardness and tensile strength of one aluminum casting
may be respectively X and Y, while for a second apparently identical
specimen we find X and 1.2Y. Specification of Y from X is subject to
error, i.e., from knowledge of X we can estimate only the average
(expected) value of Y, not the value which is actually observed.

3.2 Types of relationships. Such relationships among variables can
be classified. If there are two variables whose relationship is described
by a straight line, the term linear regression is used to describe the
relationship. If the relationship is parabolic, say $Y = aX + bX^2$ where
a and b are constants, the term curvilinear regression is used. For a
relationship among more than two variables, such as $Y = aU + bV$,
where a and b are constants, the term multiple regression is used. The
relationship among any k of n related variables, the remaining n - k
variables being, in the simplest case, held constant, is described as
partial correlation or partial regression. We shall presently illustrate
simple linear and simple multiple regression.

3.3 Uses of regression analysis. Regression analysis is useful
wherever hypotheses dealing with relationships are examined. To give
a few examples: in agriculture the relationship between crop size and
tree injury has been studied; in medicine, studies of the relationship
between vitamin potency and weight gain have successfully used
regression analysis; in economics, and other social sciences where perhaps
serious questions regarding the validity of the technique can be raised,
the use of regression analysis has been extensive and there seems to be
hardly any group of variables to which simple, multiple, and partial
correlation and regression analysis has not been applied. In industrial
research regression analysis can be used in the search for inexpensive
methods of testing as replacements for more expensive methods. We
shall give several examples of this usage.

3.4 General procedure. Let it be presumed that two variables are
linearly related. We construct the best fitting straight line.

[Figure: observations plotted against axes X and Y, with the fitted line
$Y_r = a + bX$.]

The total variation of, say, the Y-value of an observation about its
mean $\bar Y$ can be represented by

$$(Y - \bar Y)$$

where Y represents an individual observation. This may be divided
into two parts: first, a part explained by the relationship of Y to X, and
second, any remainder.

Consider the first part. If Y and X are unrelated, the expected or
best value of Y, given X, is $\bar Y$. If Y and X are related, the best
estimate of Y, given X, is $Y_r$, the ordinate of the regression line at X.
The greater the relationship between the two variables Y and X, the greater
the superiority of $Y_r$ over $\bar Y$ as an estimate of Y, for given X.
Hence the term

$$(Y_r - \bar Y)$$

represents that part of the total variation allocable to regression.

The remainder consists of variation unallocable to regression, and
inasmuch as there are no other specific factors to which such variation
can be allocated, this term is of purely residual character. It consists
of the variation of observations about the regression line and is given by

$$(Y - Y_r)$$

In measuring these three classes of variation, it might be supposed
that we use

$$\sum (Y - \bar Y) = \sum (Y_r - \bar Y) + \sum (Y - Y_r)$$

the summation extending over all n values of Y or $Y_r$. This equation is
valid but not useful, for $\sum (Y - \bar Y)$ is always zero regardless of
the amount of the variation of Y about $\bar Y$. If, however, variation is
measured by the squares of deviations we shall have

$$\sum (Y - \bar Y)^2 = \sum (Y_r - \bar Y)^2 + \sum (Y - Y_r)^2$$

as will presently be proved.

To obtain three estimates of the population variance from these sums
of squares, it is necessary to introduce, as before, the idea of degrees of
freedom. There are n values of Y in $\sum (Y - \bar Y)^2$ less the grand
mean which is computed from Y, or n - 1 degrees of freedom. There are n
values of Y in $\sum (Y - Y_r)^2$ less the computed constants of $Y_r$
(a and b), or n - 2 degrees of freedom. By subtraction there remains only
one degree of freedom for the linear regression estimate based on
$\sum (Y_r - \bar Y)^2$, which is as it should be for the two constants of
the regression line (a and b) are restricted by $\bar Y$.


Source of variation    Sum of squares             Degrees of freedom    Mean square

Linear regression      $\sum (Y_r - \bar Y)^2$    1                     $\hat\sigma_1^2 = \sum (Y_r - \bar Y)^2 / 1$
Residual               $\sum (Y - Y_r)^2$         n - 2                 $\hat\sigma_2^2 = \sum (Y - Y_r)^2 / (n - 2)$

Total                  $\sum (Y - \bar Y)^2$      n - 1

The test of significance is again the F test. If there is no real linear
relationship between Y and X, the two mean squares should be the
same. If, however, the value of F in

$$F = \frac{\hat\sigma_1^2}{\hat\sigma_2^2}$$

is sufficiently large, the regression is real.

Thus if we make many drawings (of size n) from a bowl of chips,
each chip being marked with two numbers, one being X and one Y,
the distribution of X and Y being bivariate normal, with the overall
correlation between X and Y in the bowl being zero, and if from each
drawing we construct $\hat\sigma_1^2$ and $\hat\sigma_2^2$, the distribution
of their ratio F for all drawings of size n could be determined. It is this
distribution whose values are shown in Table VIII. For example, for n = 20
only 5 per cent of such drawings will yield values of F exceeding 4.38
(1 and 19 degrees of freedom).

The results obtained in an ordinary industrial experiment are judged
by the probabilities given in Table VIII. Thus if we find F = 10 for 1
and 19 degrees of freedom, we conclude that there is less than 1 chance
in 100 that $\hat\sigma_1^2$ and $\hat\sigma_2^2$ are estimates of the
variance of the same homogeneous normal population. The hypothesis is
rejected and we shall conclude that the populations are not identical and
that the regression is real.

3.5 Fitting the regression line. The criterion we shall use in fitting
the regression line $Y_r = a + bX$ is the following: if one of the two
variables X and Y (say Y) is subject to error while the other is not, the
sum of the squares of vertical distances from the Y's to the corresponding
$Y_r$, i.e., $\sum (Y - Y_r)^2$, is to be a minimum. This requirement yields
two equations which are solved for a and b. This and other properties of a
regression line fitted to observations by the method of least squares will
be developed later.

3.6 Examples of linear regression. Brenner (4) gives the following
data on the thickness in hundred-thousandths of an inch of non-magnetic
coatings of galvanized zinc on 11 pieces of iron and steel:

Thickness as measured by        Thickness as measured by
standard destructive            non-destructive
stripping method                magnetic method
Y                               X

116                             105
132                             120
104                              85
139                             121
114                             115
129                             127
720                             630
174                             155
312                             250
338                             310
465                             443

Measurement of thickness by stripping is accurate but the tests are
destructive and costly. The magnetic method is less costly. Do the
data support the belief that we may measure X and use $Y_r$ as an
estimate of Y where

$$Y_r = a + bX$$

a and b being constants?

The following is a résumé of the procedure described in 3.4. The
significance of a straight line

$$Y_r = a + bX$$

fitted to data by the method of least squares, may be tested by
determining the probability, in random sampling from a normal population,
that the computed value of

$$F = \frac{\hat\sigma_1^2}{\hat\sigma_2^2}$$

would be exceeded, where

$$\hat\sigma_1^2 = \frac{\sum (Y_r - \bar Y)^2}{1}$$

is the mean square attributable to regression, with 1 degree of freedom,
and

$$\hat\sigma_2^2 = \frac{\sum (Y - Y_r)^2}{n - 2}$$

the residual or chance mean square, not accounted for by regression,
with n - 2 degrees of freedom.

$\bar Y$ is the mean of Y and n is the number of pairs of observations.

The Y values appear not to have come from a normal population but
experimental evidence on this point indicates that the F test can
probably be used. The points are plotted in the following diagram.

Using the method of least squares to determine a and b,

$$a = \frac{\sum X^2 \sum Y - \sum X \sum XY}{n \sum X^2 - (\sum X)^2}$$

$$b = \frac{n \sum XY - \sum X \sum Y}{n \sum X^2 - (\sum X)^2}$$

From the data,

$\sum Y = 2,743$
$\sum X = 2,461$
$\sum XY = 952,517$
$\sum Y^2 = 1,067,143$
$\sum X^2 = 852,419$
$n = 11$

from which

a = -1.7948
b = 1.1226

Using these values of a and b, the predicted values of thickness
calculated from a knowledge of the magnetic readings X are shown in the
following table.

Predicted values of        True values of
thickness                  thickness
$Y_r = a + bX$             Y

116.08                     116
132.92                     132
 93.64                     104
134.04                     139
127.31                     114
140.78                     129
705.43                     720
172.21                     174
278.86                     312
346.21                     338
495.52                     465


Is the discrepancy between these pairs of values small, from a
statistical point of view? The answer is found, i.e., the adequacy of a
linear equation is determined, by comparing that part of the total
variability for which the line can account with remaining unaccountable
(chance) variability.

Source of variation    Sum of squares             Degrees of freedom    Mean square

Regression             $\sum (Y_r - \bar Y)^2$    1                     $\sum (Y_r - \bar Y)^2 / 1$
Residual               $\sum (Y - Y_r)^2$         n - 2                 $\sum (Y - Y_r)^2 / (n - 2)$

Total                  $\sum (Y - \bar Y)^2$      n - 1

We have

$$\sum (Y - \bar Y)^2 = \sum Y^2 - \frac{(\sum Y)^2}{n} = 1,067,143 - \frac{(2,743)^2}{11} = 383,138.54$$

$$\sum (Y_r - \bar Y)^2 = \sum Y_r^2 - \frac{(\sum Y)^2}{n} = 1,064,356.41 - \frac{(2,743)^2}{11} = 380,377.41$$

from which, by subtraction,

$$\sum (Y - Y_r)^2 = 2,761.13$$


Source of variation    Sum of squares    Degrees of freedom    Mean square

Regression                 380,377.41                     1     380,377.41
Residual                     2,761.13                     9         306.79

Total                      383,138.54                    10

If the regression line is inadequate, the mean square due to regression
will not be significantly larger than the residual or chance mean square.
In our case, the ratio is

$$F = \frac{380,377.41}{306.79} = 1240$$

Table VIII gives the values of F which, with one and nine degrees of
freedom, are necessary in order to establish regression as (1) significant,
(2) highly significant. F need be only 10.56 for the regression to be
highly significant. Since we have F = 1240, our estimates of thickness
from magnetic measurements by use of the equation

$$Y_r = -1.7948 + 1.1226X$$

are statistically sound. The practical man must now decide if the
discrepancies between Y and $Y_r$ are sufficiently small, from the point of
view of the use to which the product is put.
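The whole computation for Brenner's data can be retraced in a short program. The Python sketch below is an illustration, not part of the original text; the function names `fit_line` and `regression_anova` are assumptions of the sketch. It fits the line by the least-squares formulas given above and then forms the two sums of squares and the ratio F.

```python
# Brenner's data from the table above: Y = stripping method, X = magnetic method.
Y = [116, 132, 104, 139, 114, 129, 720, 174, 312, 338, 465]
X = [105, 120, 85, 121, 115, 127, 630, 155, 250, 310, 443]


def fit_line(x, y):
    """Least-squares intercept a and slope b for the line Y_r = a + bX."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(u * v for u, v in zip(x, y))
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = (sy - b * sx) / n
    return a, b


def regression_anova(x, y):
    """Return (regression SS, residual SS, F) for the fitted straight line."""
    a, b = fit_line(x, y)
    y_bar = sum(y) / len(y)
    fitted = [a + b * v for v in x]
    ss_reg = sum((f - y_bar) ** 2 for f in fitted)          # 1 degree of freedom
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))   # n - 2 degrees of freedom
    f_ratio = (ss_reg / 1) / (ss_res / (len(y) - 2))
    return ss_reg, ss_res, f_ratio


a, b = fit_line(X, Y)
ss_reg, ss_res, f_ratio = regression_anova(X, Y)
print(round(a, 4), round(b, 4), round(ss_reg, 2), round(ss_res, 2), round(f_ratio))
```

The printed constants and sums of squares can be set beside the values of a, b, and the analysis of variance table given in the text.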

Jennett and Dudding (24) give the following 11 observations on life
tests of electric light bulbs and tests on filament wire.

Life of bulb        Quality test
in hours            on filament
Y                   X

1,605               276
1,120               293
1,320               288
1,225               315
1,055               305
1,390               315
1,385               306
1,700               286
2,070               289
1,395               296
1,105               335

A life test required about 1000 hours and cost about $5 per bulb.
The wire test is quickly performed and is lower in cost. If only wire
tests are made, can life be estimated from

$$Y_r = a + bX$$

As before, we have

$\sum Y = 15,370$
$\sum X = 3,304$
$\sum XY = 4,588,135$
$\sum X^2 = 995,238$
$n = 11$

from which a = 4,410.296 and b = -10.031.

Source of variation    Sum of squares    Degrees of freedom    Mean square

Regression                 285,429.57                     1     285,427.83
Residual                   617,238.62                     9      68,582.01

Total                      902,668.19                    10

Finally

F = 4.1

From Table VIII, with one and nine degrees of freedom F need be as
high as 10.56 in order for the linear regression to be highly significant
(1 per cent level) or 5.12 for significance at the customary level (5 per
cent). F = 4.1 does not meet these requirements. Linear regression
does not account for a sufficient part of the total variability; it is not
adequate for the purpose of prediction. The residual variability about
the regression line

$$\frac{\sum (Y - Y_r)^2}{n - 2}$$

is too large. Values of $Y_r$ are calculated from

$$Y_r = 4,410.296 - 10.031X$$

and the following table shows $Y_r$ and Y:


Predicted values        Actual life
of life                 tests
$Y_r$                   Y

1,641.7                 1,605
1,471.2                 1,120
1,521.4                 1,320
1,250.5                 1,225
1,350.8                 1,055
1,250.5                 1,390
1,340.8                 1,385
1,541.4                 1,700
1,511.3                 2,070
1,441.1                 1,395
1,049.9                 1,105

The correspondence of $Y_r$ and Y is not sufficiently high.

It is impossible to state flatly that a sample of 11 is too small but
samples perhaps of 40 or more observations are desirable if tentative
conclusions of industrial importance are to be drawn from the results.
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As an example from chemical research work, consider the following 
data given by Thomsen (41); we wish to predict titer values from the 
iodine values of fatty acids. 


Y              X              Y              X
Titer value    Iodine value   Titer value    Iodine value
(minus 40)     (minus 40)     (minus 40)     (minus 40)

2.5             7.2           2.0            14.2
2.0            10.3           1.8            10.3
3.0             7.9           5.4             0.5
3.2             8.2           3.1             9.4
2.1            13.7           4.8             1.4
4.8             1.9           3.7             5.9
2.1            13.0           0.1            16.4
1.5            17.5           1.3            17.3
4.8             0.3           4.8             0.8
3.8             7.3           1.3            16.4
2.2            12.1           0.0            12.2
0.4            18.5           5.0             2.5
6.6            -1.3           2.4            13.4
4.3             1.9           5.7             0.3
2.0            13.8           3.9             4.1
1.2            15.1           2.1            11.4
2.3             8.8           0.3            21.7
3.2             6.3           3.5             5.7
1.2            15.4           4.3             3.5
5.3            -0.2           4.5             3.8
4.0             0.7           2.2            11.1
2.9             9.1           3.4             7.6
3.0             8.8           2.3             2.8
2.9             9.1           4.9             1.9
3.1             8.8           1.7            17.8

We have

$\sum Y = 148.9$
$\sum X = 426.6$
$\sum XY = 853.37$
$\sum Y^2 = 561.77$
$\sum X^2 = 5,387.08$
$n = 50$

The accompanying table shows the analysis of variance.

Source of variation    Sum of squares    Degrees of freedom    Mean square

Regression                    99.5407                     1        99.5407
Residual                      18.8051                    48         0.3918

Total                        118.3458                    49

Forty has been subtracted from the observed values of both X and Y;
this facilitates computation, for the smaller numbers are easier to handle
and it does not affect the numerical analysis.

We have F = 254.08. The 1 per cent value of F is 7.19; the regression
(which is negative, i.e., high values of one variable are generally
associated with low values of the other variable and vice versa) is
highly significant.

3.7 Linear regression in grouped data. In the previous examples we
had at most 50 observations, so there was no reason to group the
observations. In the present example there were originally 440 observations;
they have been grouped into classes in the table below. In such a
case the analysis is slightly more complex.

It is here assumed that the variances of all eight columns are
equivalent, within the limits of chance variation. This assumption,
which may be checked by the $L_1$ test, must be met for the F test to
be valid.

The British Cotton Industry Research Association (5) gives the
following data on the frequency of warp breaks in weaving, classified
according to values of an important influence, namely, relative
humidity.


                                Relative humidity (X)

                  68-    70-    72-    74-    76-    78-    80-    82-   Total

          0.0-                          2      5      5      1              13
          0.8-            1      7     13     28     28     11      4      92
          1.6-     2      6     16     27     44     35      9      1     140
Warp      2.4-     1      5     24     24     27     17      3      2     103
breaks    3.2-            2     16      6     15      9      2      1      51
(Y)       4.0-     2      1      4      7      7      5      1             27
          4.8-            1      2      2      3      2                    10
          5.6-                   1                    2                     3
          6.4-                                        1                     1

          Total    5     16     70     81    129    104     27      8     440
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Would the regression line 

$$Y_r = a + bX$$

enable us effectively to predict warp breakage, or is the residual
variability too great?

We shall first proceed as before.

$\sum Y = 1,063.2$
$\sum X = 33,446$
$\sum XY = 80,524.4$
$\sum Y^2 = 3,110.4$
$\sum X^2 = 2,545,724$
$n = 440$

from which

a = 9.028
b = -0.087
The analysis of variance is given in the following table. 


Source of variation    Sum of squares    Degrees of freedom    Mean square

Regression                      25.51                     1          25.51
Residual                       515.81                   438           1.18

Total                          541.32                   439


An F test indicates significant regression. But the sum of squares
not due to regression, $\sum (Y - Y_r)^2$, consists of two parts: the sum
of squares of the deviations of the column means $\bar Y_c$ about the
corresponding $Y_r$, i.e., $\sum (\bar Y_c - Y_r)^2$, plus the variability
within columns $\sum (Y - \bar Y_c)^2$, which is more truly the chance or
unallocable variability.

The total residual degrees of freedom were n - 2. The unallocable
part $\sum (Y - \bar Y_c)^2$ uses k means computed from the data: hence the
unallocable mean square is

$$\frac{\sum (Y - \bar Y_c)^2}{n - k}$$

By subtraction, the deviation-from-regression mean square has k - 2
degrees of freedom.

Thus for linear regressions calculated from grouped data, the following
subsidiary variance analysis may be useful

Source of variation       Sum of squares               Degrees of freedom    Mean square

Deviation of means        $\sum (\bar Y_c - Y_r)^2$    k - 2                 $\sum (\bar Y_c - Y_r)^2 / (k - 2)$
from regression
Unallocable part of       $\sum (Y - \bar Y_c)^2$      n - k                 $\sum (Y - \bar Y_c)^2 / (n - k)$
residual (chance)

Residual                  $\sum (Y - Y_r)^2$           n - 2                 ...

We may compute the “chance” sum of squares and then determine
the deviation-from-regression sum of squares by subtraction.

It should be observed that all summations extend over the entire data. 
Hence, each mean must be counted as many times as there are observa- 
tions from which that mean was computed. This was discussed in 
Section 2.10. 


Source of variation       Sum of squares    Degrees of freedom    Mean square

Deviation of means                  3.37                     6           0.57
from regression
Unallocable part of               512.44                   432           1.19
residual (chance)

Residual                          515.81                   438            ...

The earlier F test

$$F = \frac{25.51}{1.18}$$

for 1 and 438 degrees of freedom showed the regression to be highly
significant. The present value of F is the ratio of 25.51 to 1.19, for 1
and 432 degrees of freedom and the conclusion originally reached is
shown to be valid. In the present example the subsidiary analysis of
variance was uninformative. If, however, the original test showed the
regression not to be significant, any reduction of the residual mean square
by elimination of the (possibly) allocable element (deviation from
regression) might show the regression to be significant, and the latter
inference would be the proper one.
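The subsidiary analysis can be retraced from the frequency table. The Python sketch below is an illustration, not part of the original text. Two things in it are assumptions: the class midpoints (68.5, 70.5, ... for humidity and 0.4, 1.2, ... for warp breaks), chosen because they agree with the sums quoted above, and the positions of the three observations in the two highest warp-break classes, inferred from the marginal totals. The function name `grouped_regression` is likewise illustrative.

```python
# Class midpoints: assumptions chosen so that the sums match those in the text.
X_MID = [68.5, 70.5, 72.5, 74.5, 76.5, 78.5, 80.5, 82.5]
Y_MID = [0.4, 1.2, 2.0, 2.8, 3.6, 4.4, 5.2, 6.0, 6.8]

# Frequencies: one row per warp-break class, one column per humidity class.
FREQ = [
    [0, 0, 0, 2, 5, 5, 1, 0],
    [0, 1, 7, 13, 28, 28, 11, 4],
    [2, 6, 16, 27, 44, 35, 9, 1],
    [1, 5, 24, 24, 27, 17, 3, 2],
    [0, 2, 16, 6, 15, 9, 2, 1],
    [2, 1, 4, 7, 7, 5, 1, 0],
    [0, 1, 2, 2, 3, 2, 0, 0],
    [0, 0, 1, 0, 0, 2, 0, 0],   # positions inferred from the marginal totals
    [0, 0, 0, 0, 0, 1, 0, 0],   # position inferred from the marginal totals
]


def grouped_regression(x_mid, y_mid, freq):
    """Split the residual sum of squares into deviation-of-means and within-column parts."""
    n = sx = sy = sxx = sxy = syy = 0.0
    col_n = [0.0] * len(x_mid)
    col_sum = [0.0] * len(x_mid)
    for i, y in enumerate(y_mid):
        for j, x in enumerate(x_mid):
            f = freq[i][j]
            n += f
            sx += f * x
            sy += f * y
            sxx += f * x * x
            sxy += f * x * y
            syy += f * y * y
            col_n[j] += f
            col_sum[j] += f * y
    total_ss = syy - sy ** 2 / n
    cross = sxy - sx * sy / n
    slope = cross / (sxx - sx ** 2 / n)
    regression_ss = slope * cross
    residual_ss = total_ss - regression_ss
    # Each column mean is counted once per observation in that column.
    within_ss = syy - sum(t * t / m for t, m in zip(col_sum, col_n) if m)
    deviation_ss = residual_ss - within_ss
    k = sum(1 for m in col_n if m)
    return {
        "n": n, "k": k, "slope": slope,
        "regression_ss": regression_ss, "residual_ss": residual_ss,
        "within_ss": within_ss, "deviation_ss": deviation_ss,
        "F_original": regression_ss / (residual_ss / (n - 2)),
        "F_within": regression_ss / (within_ss / (n - k)),
    }


result = grouped_regression(X_MID, Y_MID, FREQ)
for name, value in result.items():
    print(name, round(value, 3))
```

The within-column sum of squares is computed directly and the deviation-from-regression part is obtained by subtraction, which is the order of work recommended in the text.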
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NOTES 

3.8 Breakdown of $\sum (Y - \bar Y)^2$. The accompanying graph may be helpful.
It is apparent from the graph that an observation Y may always be written

$$Y = \bar Y + (Y_r - \bar Y) + (Y - Y_r)$$

the first term to the right of the equality sign ($\bar Y$) being common to
all observations, the second term $(Y_r - \bar Y)$ representing that part of
the value of the observation Y attributable to regression, and the last term
$(Y - Y_r)$ representing the unallocable (and therefore dealt with as chance)
variability about the line of regression. The total deviation $(Y - \bar Y)$
thus consists of two parts, regression $(Y_r - \bar Y)$ and residual
$(Y - Y_r)$.

$$(Y - \bar Y) = (Y_r - \bar Y) + (Y - Y_r)$$

Summing this linear expression merely leads to zero = zero. We must show that

[1] $$\sum (Y - \bar Y)^2 = \sum (Y_r - \bar Y)^2 + \sum (Y - Y_r)^2$$

We have

[2] $$\sum (Y - \bar Y)^2 = \sum [(Y - Y_r) + (Y_r - \bar Y)]^2$$
$$= \sum (Y - Y_r)^2 + \sum (Y_r - \bar Y)^2 + 2\sum (Y - Y_r)(Y_r - \bar Y)$$

We now show that if the regression line

$$Y_r = a + bX$$

is fitted to the data by the method of least squares, the cross-product term
of [2] is zero and [1] is valid.

3.9 The method of least squares. In the method of least squares, the
function $Y_r = a + bX$ is fitted to the data so that

$$\sum (Y - Y_r)^2$$

which will be designated by $\varphi$, is minimum. The necessary conditions are

$$\frac{\partial}{\partial a} \sum (Y - a - bX)^2 = 0$$

$$\frac{\partial}{\partial b} \sum (Y - a - bX)^2 = 0$$

Differentiation yields

[3] $$-2\sum (Y - a - bX) = 0$$
$$-2\sum (Y - a - bX)X = 0$$

It is apparent that $\partial^2\varphi/\partial a^2$ and
$\partial^2\varphi/\partial b^2$ are always positive, i.e., [3] are
conditions for minimum $\varphi$. [3] may be written

$$\sum Y = na + b\sum X$$

$$\sum XY = a\sum X + b\sum X^2$$

These “normal” equations may be solved simultaneously for a and b, the
Y-axis intercept and the slope of the regression line.

We return to the cross-product term of [2], which may be written

$$\sum (Y - Y_r)Y_r - \bar Y \sum (Y - Y_r)$$

$$= \sum (Y - Y_r)(a + bX) - \bar Y \sum (Y - Y_r)$$

[4] $$= a\sum (Y - Y_r) + b\sum (Y - Y_r)X - \bar Y \sum (Y - Y_r)$$

From [3] it is clear that [4] is 0. Hence [1] is valid.
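A numerical check of this argument is easy to make. The Python sketch below is an illustration only; it uses Brenner's thickness data from section 3.6, and the function name `solve_normal_equations` is an assumption of the sketch. It solves the two normal equations, then confirms that the two sums in [3] and the cross-product term of [2] vanish apart from rounding error, so that the breakdown [1] holds.

```python
# Brenner's thickness data from section 3.6 (Y = stripping, X = magnetic).
Y = [116, 132, 104, 139, 114, 129, 720, 174, 312, 338, 465]
X = [105, 120, 85, 121, 115, 127, 630, 155, 250, 310, 443]


def solve_normal_equations(x, y):
    """Solve sum(Y) = n*a + b*sum(X) and sum(XY) = a*sum(X) + b*sum(X^2)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(u * v for u, v in zip(x, y))
    det = n * sxx - sx * sx          # determinant of the 2 x 2 system
    a = (sxx * sy - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b


a, b = solve_normal_equations(X, Y)
y_bar = sum(Y) / len(Y)
fitted = [a + b * v for v in X]
residuals = [y - f for y, f in zip(Y, fitted)]

sum_resid = sum(residuals)                                        # first condition of [3]
sum_resid_x = sum(r * v for r, v in zip(residuals, X))            # second condition of [3]
cross = sum(r * (f - y_bar) for r, f in zip(residuals, fitted))   # cross-product term of [2]

total_ss = sum((y - y_bar) ** 2 for y in Y)
reg_ss = sum((f - y_bar) ** 2 for f in fitted)
res_ss = sum(r * r for r in residuals)

print(sum_resid, sum_resid_x, cross)
print(total_ss, reg_ss + res_ss)
```

The first line printed should consist of quantities negligibly different from zero, and the two figures on the second line should agree.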

3.10 Curvilinear regression. If a curvilinear regression function, say
the parabola

$$Y_r = a + bX + cX^2$$

(with two degrees of freedom) or an m-degree function (with m degrees of
freedom) is fitted to the data by the method of least squares, the
cross-product term is easily shown to be 0. The normal equations for
determining a, b, and c would be

$$\sum Y = na + b\sum X + c\sum X^2$$
$$\sum XY = a\sum X + b\sum X^2 + c\sum X^3$$
$$\sum X^2 Y = a\sum X^2 + b\sum X^3 + c\sum X^4$$

The extension to the general case of an mth degree polynomial is obvious.

A high degree regression function may fit a set of observations “better”
than a low degree function, but the error term may be increased by loss of
degrees of freedom. Lower degree functions, such as the straight line or
the parabola, are to be preferred for it is best to formalize observed
relationships as simply as possible. For it is often impossible,
particularly in industrial research, to rationalize (i.e., to explain by
recourse to available theory) any but these simpler relationships.
Higher-degree regression functions may lead to new theory, but the use of
simpler relationships is in keeping with the conservative methodology of
experimental science.
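The three normal equations for the parabola form a small linear system, and solving it can be sketched as follows. The Python code is an illustration only: the data are hypothetical points lying exactly on a known parabola, so that the recovered constants can be recognized, and the function names `solve_linear_system` and `fit_parabola` are assumptions of the sketch.

```python
def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_parabola(x, y):
    """Solve the three normal equations for Y_r = a + bX + cX^2."""
    s = [sum(v ** p for v in x) for p in range(5)]                    # sums of X^0 .. X^4
    t = [sum((u ** p) * v for u, v in zip(x, y)) for p in range(3)]   # sums of X^p * Y
    matrix = [
        [s[0], s[1], s[2]],
        [s[1], s[2], s[3]],
        [s[2], s[3], s[4]],
    ]
    return solve_linear_system(matrix, t)


# Hypothetical illustration: points lying exactly on Y = 1 + 2X + 3X^2.
xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * v + 3 * v * v for v in xs]
a, b, c = fit_parabola(xs, ys)
print(round(a, 6), round(b, 6), round(c, 6))
```

For an mth degree polynomial the same routine applies with m + 1 equations, the sums of powers running up to $X^{2m}$.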

3.11 Degrees of freedom. A basic explanation of the allocation of the
total number of degrees of freedom is beyond the possibilities of this book,
but the following may be helpful. The quantity $\sum (Y - \bar Y)^2$ summed
over n observations has n - 1 degrees of freedom, for the mean $\bar Y$ has
been calculated from the n observations. The regression sum of squares
$\sum (Y_r - \bar Y)^2$ may be written

$$\sum (Y_r - \bar Y)^2 = \sum (\bar Y + b(X - \bar X) - \bar Y)^2$$
$$= b^2 \sum (X - \bar X)^2$$

$X - \bar X$ is independent of any correlation between X and Y. Hence
variability in $\sum (Y_r - \bar Y)^2$ depends on b; accordingly only one
degree of freedom is allocated to regression. The residual sum of squares
$\sum (Y - Y_r)^2$ absorbs the remaining n - 2 degrees of freedom.

3.12 Linear regression in grouped data. The accompanying diagram
refers to the problem of linear regression in grouped data.

$$Y = \bar Y + (Y_r - \bar Y) + (Y - Y_r)$$

$$= \bar Y + (Y_r - \bar Y) + (\bar Y_c - Y_r) + (Y - \bar Y_c)$$

To show that

[5] $$\sum (Y - Y_r)^2 = \sum (\bar Y_c - Y_r)^2 + \sum (Y - \bar Y_c)^2$$

write

$$\sum (Y - Y_r)^2 = \sum [(\bar Y_c - Y_r) + (Y - \bar Y_c)]^2$$

$$= \sum (\bar Y_c - Y_r)^2 + \sum (Y - \bar Y_c)^2 + 2\sum (\bar Y_c - Y_r)(Y - \bar Y_c)$$

But

$$\sum (\bar Y_c - Y_r)(Y - \bar Y_c)$$

$$= \sum \bar Y_c (Y - \bar Y_c) - \sum Y_r (Y - \bar Y_c)$$

$$= 0 - 0$$

or [5] is valid.

3.13 Computation of sums of squares for linear regression analysis. To
compute $\sum (Y_r - \bar{Y})^2$ we may use the expansion

$$\sum (Y_r - \bar{Y})^2 = \sum Y_r^2 - 2\bar{Y} \sum Y_r + \sum \bar{Y}^2 = \sum Y_r^2 - n\bar{Y}^2 = \sum Y_r^2 - \frac{(\sum Y)^2}{n}$$

for, from the normal equations,

$$\frac{\sum Y_r}{n} = \bar{Y}$$

It is, however, not necessary to calculate $Y_r$ to find $\sum (Y_r - \bar{Y})^2$, for

$$\sum (Y_r - \bar{Y})^2 = \sum (a + bX - \bar{Y})^2 = b^2 \sum (X - \bar{X})^2 = \frac{\left[\sum XY - \dfrac{\sum X \sum Y}{n}\right]^2}{\sum X^2 - \dfrac{(\sum X)^2}{n}}$$

Convenient expansions for the other sums of squares are

$$\sum (Y - \bar{Y})^2 = \sum Y^2 - n\bar{Y}^2 = \sum Y^2 - \frac{(\sum Y)^2}{n}$$

and

$$\sum (Y - Y_r)^2 = \sum (Y - Y_r)(Y - Y_r)$$

$$= \sum (Y - Y_r)Y - a \sum (Y - Y_r) - b \sum (Y - Y_r)X$$

$$= \sum (Y - Y_r)Y - 0 - 0$$

$$= \sum Y^2 - a \sum Y - b \sum XY$$
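
The shortcut formulas of notes 3.11 and 3.13 can be checked numerically. The sketch below is only an illustration: the six (X, Y) pairs are invented for the purpose and the function name is hypothetical. It fits Y_r = a + bX and confirms that the total sum of squares splits into the regression and residual parts.

```python
def regression_sums_of_squares(xs, ys):
    """Fit Y_r = a + bX by least squares and return a, b and the sums of squares."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    s_xx = sum_xx - sum_x ** 2 / n        # sum of (X - Xbar)^2
    s_xy = sum_xy - sum_x * sum_y / n     # sum of (X - Xbar)(Y - Ybar)
    b = s_xy / s_xx
    a = sum_y / n - b * sum_x / n

    total = sum_yy - sum_y ** 2 / n               # n - 1 degrees of freedom
    regression = b ** 2 * s_xx                    # 1 degree of freedom
    residual = sum_yy - a * sum_y - b * sum_xy    # n - 2 degrees of freedom
    return a, b, total, regression, residual


# Invented data, used only to exercise the formulas.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

a, b, total, regression, residual = regression_sums_of_squares(xs, ys)

# Residual sum of squares computed the long way, from the fitted values.
direct_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

print(f"a = {a:.4f}, b = {b:.4f}")
print(f"total = {total:.4f}, regression = {regression:.4f}, residual = {residual:.4f}")
```

The residual obtained from the shortcut agrees with the value computed directly from the fitted line, which is the point of the expansion.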

3.14 Regression and prediction (see Eisenhart, 12). The regression line

$$Y_r = a + bX$$

obtained by minimizing

$$\sum (Y - Y_r)^2$$

X being the independent variable, differs from the regression line $X_r = c + dY$
obtained if

$$\sum (X - X_r)^2$$

is minimized. The decision as to which line (or curve) is appropriate (i.e.,
which of the two variables, X and Y, is to be considered independent) depends
not on what we would like to predict, but on which of the two variables,
X and Y, is free from error. If we are studying the relationship between
quality of output (Y) and time (X), the latter will generally be represented by
values (selected in advance of measurement of Y), say, at daily or weekly
intervals; such selected values are free from error. Measurements of Y are,
however, subject to error and the appropriate regression line would be that which
minimized $\sum (Y - Y_r)^2$, that is

$$Y_r = a + bX$$

This is the linear regression of Y on X. It is important to note that in the
theory of regression, only the dependent variable (Y in the above example)
is required to be normally distributed. The correctness of regression analysis
is unimpaired by the fact that the X values are arbitrarily selected, for
example, uniformly spaced.

Many problems in industrial research do not lend themselves to a clear-cut
decision as to which variable is free from error. In the example on titer and
iodine, neither variable appears to have been selected, and both X and Y vary
normally and one regression line is apparently as good as the other. A better
solution to this problem has been suggested by Wald (44).

Occasionally the variable to be estimated, say Y, may not be subject to error
whereas X is subject to error. In this case we would first obtain the regression
of X on Y

$$X_r = c + dY$$

and determine Y from

$$Y = \frac{X_r - c}{d}$$
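
That the two regression lines really are different lines can be seen from a small numerical sketch. The data and the helper name `fit_line` below are invented for illustration; the only claim checked is the standard one that the slope of Y on X and the slope obtained by inverting the regression of X on Y coincide only when the points lie exactly on a line.

```python
def fit_line(us, vs):
    """Least-squares line v_r = intercept + slope * u, errors assumed to lie in v only."""
    n = len(us)
    mean_u, mean_v = sum(us) / n, sum(vs) / n
    s_uu = sum((u - mean_u) ** 2 for u in us)
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    slope = s_uv / s_uu
    return mean_v - slope * mean_u, slope


# Invented data with visible scatter.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 5, 3, 8, 6, 9]

a, b = fit_line(xs, ys)   # regression of Y on X:  Y_r = a + bX
c, d = fit_line(ys, xs)   # regression of X on Y:  X_r = c + dY

# Solving X = c + dY for Y gives a line of slope 1/d in the (X, Y) plane.
slope_from_inverse = 1 / d

print(f"Y on X: slope {b:.4f};  X on Y inverted: slope {slope_from_inverse:.4f}")
print(f"product b*d = {b * d:.4f}  (equals the squared correlation)")
```

The product b·d is the squared correlation coefficient, so the two slopes agree only when that product is 1.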
3.15 Example of multiple regression. Fulweiler, Stang, and
Sweetman (18) give the following data on worn wire rope of nominal
diameter Yl to Y% inch. Can tensile strength Y be estimated from a
linear relationship connecting it with U and V?


 Tensile strength   Number of      Length of        Tensile strength   Number of      Length of
 in thousands of    broken wires   worn surface     in thousands of    broken wires   worn surface
 pounds per         in worst lay                    pounds per         in worst lay
 square inch                                        square inch
        Y                U              V                  Y                U              V

       174               8             0.14               178               9             0.14
       185               0             0.00               185              12             0.14
       188               8             0.12               172               0             0.00
       160              14             0.11               158               8             0.11
       179               0             0.00               162              14             0.11
       183               0             0.11               176               0             0.00
       191               0             0.09               192              29             0.13
       177               0             0.00               198              37             0.13
       183               0             0.12               158               0             0.00
       186               0             0.13               152               6             0.10
       180               0             0.19               136               7             0.08
       184               3             0.15               174               0             0.00
       175               5             0.13               180               2             0.15
       175               0             0.00               172               5             0.15
       166               2             0.16               172               0             0.00
       170              14             0.15               174               0             0.13
       180               0             0.00               162               5             0.13
       181               0             0.12               165               0             0.00
       201              12             0.15               153              12             0.11
       172               0             0.00               111              22             0.11
       184               5             0.16               173               0             0.00
       145              11             0.15               177               0             0.09
       172               0             0.00               180              19             0.10
       133               8             0.11               172               0             0.00
       157              21             0.14               181               0             0.10
       175               0             0.00               138              14             0.10


3.16 Normal equations. The theory of the previous chapter applies;
a multiple regression function of the linear form

$$Y_r = a + bU + cV$$

has two degrees of freedom, i.e., as many degrees of freedom as there are
independent variables. The residual variance has $n - 3$ degrees of
freedom, three degrees of freedom being lost by the calculation of a
(or $\bar{Y}$), b, and c from the data. If this regression plane is to be fitted to the
ungrouped data by the method of least squares, we shall have the
following normal equations:

$$an + b \sum U + c \sum V = \sum Y$$

$$a \sum U + b \sum U^2 + c \sum VU = \sum YU$$

$$a \sum V + b \sum VU + c \sum V^2 = \sum YV$$

from which a, b, and c may be found.

We have

    $\sum Y = 8,907$          $\sum YU = 52,335$
    $\sum U = 312$            $\sum YV = 778.41$
    $\sum V = 4.54$           $\sum U^2 = 5,372$
    $\sum VU = 39.05$         $\sum V^2 = 0.5926$
                              $n = 52$

from which

    $a = 171.258$
    $b = -0.4133$
    $c = 28.750$
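
As a check on the arithmetic, the three normal equations can be solved directly from the sums quoted above. The sketch below uses Cramer's rule for the 3 by 3 system; the helper names `det3` and `solve3` are illustrative and not part of the original text. The regression sum of squares is then taken from the shortcut formula a ΣY + b ΣYU + c ΣYV − (ΣY)²/n.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve the 3 x 3 linear system m * x = rhs by Cramer's rule."""
    d = det3(m)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


# Sums for the wire-rope data, as quoted in the text.
n = 52
sum_y, sum_u, sum_v = 8907, 312, 4.54
sum_uu, sum_vu, sum_vv = 5372, 39.05, 0.5926
sum_yu, sum_yv = 52335, 778.41

normal_matrix = [
    [n,     sum_u,  sum_v],
    [sum_u, sum_uu, sum_vu],
    [sum_v, sum_vu, sum_vv],
]
a, b, c = solve3(normal_matrix, [sum_y, sum_yu, sum_yv])

# Regression sum of squares from the shortcut formula (unrounded coefficients).
regression_ss = a * sum_y + b * sum_yu + c * sum_yv - sum_y ** 2 / n

print(f"a = {a:.3f}, b = {b:.4f}, c = {c:.3f}")
print(f"regression sum of squares = {regression_ss:.3f}")
```

The coefficients must be carried to full precision in the shortcut formula, because the regression sum of squares is a small difference between two large numbers; rounding a, b, and c first changes the result noticeably.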


The analysis of variance follows:

 Source of variation    Sum of squares    Degrees of freedom    Mean square

 Regression                   479.323              2               239.66
 Residual                  13,935.350             49               284.39

 Total                     14,414.673             51



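The sums of squares may be computed from any two of the following:

$$\sum (Y - \bar{Y})^2 = \sum Y^2 - \frac{(\sum Y)^2}{n}$$

$$\sum (Y_r - \bar{Y})^2 = \sum Y_r^2 - n\bar{Y}^2 = a \sum Y + b \sum YU + c \sum YV - \frac{(\sum Y)^2}{n}$$

$$\sum (Y - Y_r)^2 = \sum Y^2 - a \sum Y - b \sum YU - c \sum YV$$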















The mean square due to regression is seen to be even less than that not
associated with regression. No F test is necessary; clearly the relationship

$$Y = 171.258 - 0.4133\,U + 28.750\,V$$

is inadequate.

3.17 Further examples. The following data on the tensile strength, 
hardness, and density of 60 specimens of die-cast aluminum are given 
by Shewhart (36). 


 Tensile strength   Hardness     Density           Tensile strength   Hardness     Density
 (pounds per        (Rockwell    (grams per        (pounds per        (Rockwell    (grams per
 square inch)       E)           cubic             square inch)       E)           cubic
                                 centimeter)                                       centimeter)

      29,314          53.0         2.666                29,250          71.3         2.648
      34,860          70.2         2.708                27,992          52.7         2.400
      36,818          84.3         2.865                31,852          76.5         2.692
      30,120          55.3         2.627                27,646          63.7         2.669
      34,020          78.5         2.581                31,698          69.2         2.628
      30,824          63.5         2.633                30,844          69.2         2.696
      35,396          71.4         2.671                31,988          61.4         2.648
      31,260          53.4         2.650                36,640          83.7         2.775
      32,184          82.5         2.717                41,578          94.7         2.874
      33,424          67.3         2.614                30,496          70.2         2.700
      37,694          69.5         2.524                29,668          80.4         2.583
      34,876          73.0         2.741                32,622          76.7         2.668
      24,660          55.7         2.619                32,822          82.9         2.679
      34,760          85.8         2.755                30,380          55.0         2.609
      38,020          95.4         2.846                38,580          83.2         2.721
      25,680          51.1         2.575                28,202          62.6         2.678
      25,810          74.4         2.561                29,190          78.0         2.610
      26,460          54.1         2.593                35,636          84.6         2.728
      28,070          77.8         2.639                34,332          64.0         2.709
      24,640          52.4         2.611                34,750          75.3         2.880
      25,770          69.1         2.696                40,578          84.8         2.949
      23,690          53.5         2.606                28,900          49.4         2.669
      28,650          64.3         2.616                34,648          74.2         2.624
      32,380          82.7         2.748                31,244          59.8         2.705
      28,210          55.7         2.518                33,802          75.2         2.736
      34,002          70.5         2.726                34,850          57.7         2.701
      34,470          87.5         2.875                36,690          79.3         2.776
      29,248          50.7         2.585                32,344          67.6         2.754
      28,710          72.3         2.547                34,440          77.0         2.660
      29,830          59.5         2.606                34,650          74.8         2.819

Is the multiple regression of tensile strength (Y) on hardness (U) and
density (V) significant?


 Source of variation    Sum of squares    Degrees of freedom    Mean square

 Regression               524,681,187             2             262,340,593.5
 Residual                 417,700,935            57               7,328,086

 Total                    942,274,585            59



$F = \dfrac{262,340,593.5}{7,328,086} = 35.80$. From Table VIII, for 2 and 57 degrees
of freedom, we need F = 4.98 for highly significant regression. Hence
the equation found by Shewhart 

$$Y_r = 150.988\,U + 15310.35\,V$$

describes the relationship between tensile strength, hardness, and density. 

In the preceding chapter, it was shown that a linear relationship 
between the life of light bulbs and a certain test of filament wire was 
not statistically significant. A second type of test was made on the 
wire. Jennett and Dudding (24) report the following results, the first 
two columns showing the data already considered on page 103. 


Life of bulbs 
Y 

1605 

1120 

1320 

1225 

1055 

1390 

1385 

1700 

2070 

1395 

1105 


Test of wire
U
276 
293 
288 
315 

305 
315 

306 
286 
289 
296 
335 


Test of wire 
V 

14.2 

15.6 
16.1 

15.2 

14.6 

21.4 

19.4 
18.9 

18.5 

20.8 


Would a multiple relation of the form 

$$Y_r = a + bU + cV$$

succeed where the linear relation between Y and U failed? 
We find

    $a = 4,336.29$
    $b = -12.69$
    $c = 49.80$

 Source of variation    Sum of squares    Degrees of freedom    Mean square

 Regression                388,795.13             2              194,397.57
 Residual                  481,227.37             7               68,746.77

 Total                     870,022.50             9

$F = \dfrac{194,397.57}{68,746.77} = 2.83$. From Table VIII, for 2 and 7 degrees of
freedom, a value of F = 4.74 is required for significance at the 0.05
level. The regression is not statistically significant even when two
independent variables are included.
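
The three analyses of variance in this chapter all end in the same step: each sum of squares is divided by its degrees of freedom and the ratio of the two mean squares is compared with the tabled value of F. A minimal sketch of that step follows, using the sums of squares quoted in the text; the function name `anova_f` is illustrative, and the critical values are simply the ones the text reads from Table VIII.

```python
def anova_f(regression_ss, regression_df, residual_ss, residual_df):
    """Return (regression mean square, residual mean square, F ratio)."""
    regression_ms = regression_ss / regression_df
    residual_ms = residual_ss / residual_df
    return regression_ms, residual_ms, regression_ms / residual_ms


# Sums of squares and degrees of freedom as quoted in the text.
examples = {
    "worn wire rope":    (479.323, 2, 13_935.350, 49),
    "die-cast aluminum": (524_681_187, 2, 417_700_935, 57),
    "light bulbs":       (388_795.13, 2, 481_227.37, 7),
}

results = {name: anova_f(*values) for name, values in examples.items()}

for name, (reg_ms, res_ms, f_ratio) in results.items():
    print(f"{name:18s} regression MS = {reg_ms:,.2f}  "
          f"residual MS = {res_ms:,.2f}  F = {f_ratio:.2f}")

# Comparison with the tabled values cited in the text.
aluminum_significant = results["die-cast aluminum"][2] > 4.98   # 1 per cent point, 2 and 57 d.f.
bulbs_significant = results["light bulbs"][2] > 4.74            # 5 per cent point, 2 and 7 d.f.
```

For the wire-rope data the ratio is below 1, which is why the text dispenses with the F test altogether.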

NOTES 

3.18 Least squares and multiple regression. In fitting a regression
plane

$$Y_r = a + bU + cV$$

to observed data by the method of least squares,

$$\sum (Y - Y_r)^2$$

is required to be a minimum, Y being the observed values of the dependent
variable. Write

$$\varphi = \sum (Y - a - bU - cV)^2$$

For $\varphi$ minimum

$$\frac{\partial \varphi}{\partial a} = -2 \sum (Y - a - bU - cV) = 0$$

[6] $$\frac{\partial \varphi}{\partial b} = -2 \sum (Y - a - bU - cV)\,U = 0$$

$$\frac{\partial \varphi}{\partial c} = -2 \sum (Y - a - bU - cV)\,V = 0$$

Furthermore $\partial^2 \varphi / \partial a^2$, $\partial^2 \varphi / \partial b^2$, $\partial^2 \varphi / \partial c^2$ are positive. Rearranging [6] we have
the normal equations

$$\sum Y = na + b \sum U + c \sum V$$

$$\sum YU = a \sum U + b \sum U^2 + c \sum VU$$

$$\sum YV = a \sum V + b \sum VU + c \sum V^2$$

which may be solved simultaneously for a, b, and c.

3.19 Computation of the sums of squares for multiple regression analysis.
In order to reduce the labor involved in computing sums of squares, the
following were suggested:

$$\sum (Y - \bar{Y})^2 = \sum Y^2 - n\bar{Y}^2$$

$$\sum (Y_r - \bar{Y})^2 = \sum (Y_r - \bar{Y})(Y_r - \bar{Y}) = \sum (Y_r - \bar{Y})\,Y_r$$

$$= \sum (a + bU + cV - \bar{Y})\,Y_r$$

$$= a \sum Y_r + b \sum U Y_r + c \sum V Y_r - \bar{Y} \sum Y_r$$

$$= a \sum Y + b \sum UY + c \sum VY - \frac{(\sum Y)^2}{n}$$

since

$$\sum Y_r = \sum (a + bU + cV) = na + b \sum U + c \sum V = \sum Y$$

and similarly for $\sum U Y_r$ and $\sum V Y_r$.

$$\sum (Y - Y_r)^2 = \sum (Y - Y_r)(Y - Y_r)$$

$$= \sum (Y - Y_r)\,Y - \sum (Y - Y_r)\,Y_r$$

$$= \sum (Y - Y_r)\,Y - \sum (Y - Y_r)(a + bU + cV)$$

From the three normal equations, we have

$$\sum (Y - Y_r) = 0, \quad \sum (Y - Y_r)\,U = 0, \quad \sum (Y - Y_r)\,V = 0$$

Hence

$$\sum (Y - Y_r)^2 = \sum (Y - Y_r)\,Y$$

$$= \sum (Y - a - bU - cV)\,Y$$

$$= \sum Y^2 - a \sum Y - b \sum UY - c \sum VY$$
3.20 Analysis of covariance. Furry (19) gives the following data on
the breaking strength and thickness of starch films.

Breaking strength and thickness of starch films*

              Corn starch                              Potato starch

  No.    Thickness    Breaking strength      No.    Thickness    Breaking strength

   1       ....            731.0              1       ....            983.3
   2        7.3            710.0              2       ....            958.8
   3        7.2            604.7              3       ....            747.8
   4        6.1            508.8              4       12.2            866.0
   5        6.4            ....               5       11.6            810.8
   6        6.4            ....               6        9.7            950.0
   7        6.9            ....               7       10.8          1,282.0
   8        5.8            335.6              8       10.1          1,233.8
   9        5.3            306.4              9       12.7          1,660.0
  10        6.7            426.0             10        9.8            746.0
  11        5.8            382.5             11       10.0            650.0
  12        5.7            ....              12       13.8            992.5
  13        6.1            436.7             13       13.3            896.7
  14        6.2            333.3             14       12.4            873.9
  15        6.3            382.3             15       12.2            924.4
  16        6.0            397.7             16       14.1          1,050.0
  17        6.8            619.1             17       13.7            973.3
  18        7.9            857.3             ..       ....            ....
  19        7.2            592.5             ..       ....            ....



Analysis of variance of breaking strengths

 Source of variation    Sum of squares    Degrees of freedom    Mean square

 Among starches           5,307,433.08            6              884,572.18
 Within starches          1,987,918.13           87               22,849.63

 Total                    7,295,351.21           93                  ...


The differences in breaking strengths from starch to starch are highly
significant. But examination of the data indicates that at least some of this
apparent significance is due merely to differences in the thickness of the
starch film and not to any chemical superiority of certain starches over others.
To determine the effect of thickness on strength, the relationships between the
two must first be measured for each starch; for our purposes the best measure
is given by the regression coefficient b (the slope of the regression line) of
breaking strength on thickness.

$$b = \frac{\sum (Y - \bar{Y})(X - \bar{X})}{\sum (X - \bar{X})^2} = \frac{\sum YX - n\bar{Y}\bar{X}}{\sum x^2}$$

where Y represents a breaking strength value and X represents the
corresponding value of thickness; lower-case letters denote deviations from the
starch mean, so that $\sum x^2 = \sum X^2 - n\bar{X}^2$. To illustrate, we have for
sweet-potato starch

$$b = \frac{26,658.76 - 26,022.60}{339.48 - 334.89} = 138.60$$
We form the following table.*

 Starch            $\sum y^2$       $\sum yx$     $\sum x^2$       b       $\sum (Y - Y_r)^2$

 Wheat             254,104.85     2,310.53      25.96      89.00        48,459.67
 Dasheen            13,215.27       156.39       2.65      59.02         3,985.90
 Corn              447,055.68     1,866.60       9.88     188.93        94,404.31
 Rice               30,993.44       417.61       9.15      45.64        11,933.54
 Sweet potato      105,772.34       636.16       4.59     138.60        17,602.50
 Canna             232,013.75     2,763.55      46.61      59.29        68,160.32
 Potato            904,762.80     1,192.81      36.66      32.54       865,952.22

 Total           1,987,918.13     9,343.65     135.50                1,110,498.46


We should like to eliminate from the variation in breaking strength that 
part attributable to variation in thickness; to do so it would be convenient to 
use an average within-starch regression coefficient such as would be given by 
b = 9,343.65/135.50. This is permissible only if the differences among the 
seven regression coefficients are not statistically significant. To test the 
latter, we have 


 Source of variation                                   Sum of squares   Degrees of   Mean square
                                                                        freedom

 Deviation of within-starch regression lines
 from average within-starch regression line,
 $\sum (Y_r - Y_{\bar{r}})^2$                                         233,111.22        6         38,851.87

 Deviation of observations from within-starch
 regression (error), $\sum (Y - Y_r)^2$                      1,110,498.46       80         13,881.23

 Deviation of observations from average
 within-starch regression line, $\sum (Y - Y_{\bar{r}})^2$      1,343,609.68       86

The calculation of the above sums of squares can be carried out in the
following way:

$$\sum (Y - Y_{\bar{r}})^2 = \sum y^2 - \frac{(\sum yx)^2}{\sum x^2}$$

The value of $\sum y^2$ for within starches for the entire table is 1,987,918.13.
Also

$$\frac{(\sum yx)^2}{\sum x^2} = \frac{(9,343.65)^2}{135.50} = 644,308.45$$

or

$$\sum (Y - Y_{\bar{r}})^2 = 1,343,609.68$$

In $\sum (Y - Y_r)^2$, deviations are measured from the within-starch regression
lines, i.e.,

$$\sum (Y - Y_r)^2 = 1,110,498.46$$

By subtraction

$$\sum (Y_r - Y_{\bar{r}})^2 = 233,111.22$$

The allocation of degrees of freedom is easily explained. For the total sum of
squares, $\sum (Y - Y_{\bar{r}})^2$, there are the 87 degrees of freedom allocated to
within-starch variation in the original analysis of variance less the 1 degree of
freedom lost by the fact that the deviations are taken about the over-all
within-starch regression line. Second, from the original 87 degrees for
within-starch variation we must subtract 7 degrees of freedom attributable to the
7 regression lines, leaving 80 degrees of freedom for the term $\sum (Y - Y_r)^2$. By
subtraction, there remain 6 degrees of freedom for $\sum (Y_r - Y_{\bar{r}})^2$.

Applying the F test, we find

$$F = \frac{38,851.87}{13,881.23} = 2.80$$

which for 6 and 80 degrees of freedom is significant at the 5 per cent level but
not significant at the 1 per cent level. It is up to the analyst to decide whether
or not he will continue. We shall consider the regression coefficients to be not
significantly different; they are presumed to be replaced by

$$b = \frac{9,343.65}{135.50} = 68.96$$

We are now able to calculate adjusted breaking strength means for each
starch, each strength mean being corrected by elimination of the effect of
thickness. The figure below illustrates the situation. Clearly most of the
variation in breaking strength is explained by variation in thickness. We
have, for any mean $\bar{Y}_s$

$$\bar{Y}_s = \bar{Y} + \underbrace{(\bar{Y}_s - Y_{\bar{r}})}_{\text{variation around regression}} + \underbrace{(Y_{\bar{r}} - \bar{Y})}_{\text{variation due to regression}}$$

We wish to eliminate the effect of the last term on $\bar{Y}_s$. Hence we want
corrected values of $\bar{Y}_s$ given by

$$\text{Corrected } \bar{Y}_s = \bar{Y} + (\bar{Y}_s - Y_{\bar{r}})$$

or since

$$Y_{\bar{r}} - \bar{Y} = bx$$

$$\text{Corrected } \bar{Y}_s = \bar{Y}_s - bx$$

The following tabular form is convenient.


 Starch          Original mean    Mean         $\bar{X}_s - \bar{X}$     $bx = b(\bar{X}_s - \bar{X})$    Corrected mean
                 breaking         thickness                                       breaking
                 strength $\bar{Y}_s$    $\bar{X}_s$                                           strength

 Wheat              308.7           4.90        -3.00          -206.9             515.6
 Dasheen            412.7           6.33        -1.57          -108.3             521.0
 Corn               482.8           6.53        -1.37           -94.5             577.3
 Rice               539.2           7.74        -0.16           -11.0             550.2
 Sweet potato       711.0           9.15         1.25            86.2             624.8
 Canna              795.3          10.19         2.29           157.9             637.4
 Potato             976.5          11.96         4.06           280.0             696.5
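
The correction of each starch mean is a one-line calculation, sketched below from the tabled means. The over-all mean thickness of 7.90 is not stated in the text; it is inferred here from the tabled deviations (for example 4.90 + 3.00), and the variable names are illustrative.

```python
average_slope = 68.96      # average within-starch regression coefficient b
grand_mean_x = 7.90        # over-all mean thickness, inferred from the tabled deviations

# starch: (original mean breaking strength, mean thickness), from the table
starch_means = {
    "Wheat":        (308.7, 4.90),
    "Dasheen":      (412.7, 6.33),
    "Corn":         (482.8, 6.53),
    "Rice":         (539.2, 7.74),
    "Sweet potato": (711.0, 9.15),
    "Canna":        (795.3, 10.19),
    "Potato":       (976.5, 11.96),
}

corrected = {}
for name, (mean_y, mean_x) in starch_means.items():
    bx = average_slope * (mean_x - grand_mean_x)   # part of the mean due to thickness
    corrected[name] = mean_y - bx                  # corrected mean = original mean - bx

for name, value in corrected.items():
    print(f"{name:13s} corrected mean breaking strength = {value:6.1f}")
```

After correction the means range only from about 516 to about 697, against 309 to 977 before, which is the narrowing the text attributes to thickness.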


We can now judge the effect of the independent variable, thickness. The
original analysis of variance is replaced by the following table:

 Source of variation    Sum of squares    Degrees of freedom    Mean square

 Among starches              99,947.77            6               16,657.96
 Within starches          1,343,609.68           86               15,623.37

 Total                    1,443,557.45           92

The total sum of squares 1,443,557.45 differs from the previous total
7,295,351.21, for the latter represented variation about the grand mean $\bar{Y}$, i.e.,

$$\sum (Y - \bar{Y})^2$$

whereas the present total sum of squares represents variation of the
observations about a regression line fitting the 94 points, i.e.,

$$\sum (Y - Y_R)^2$$
The same is true for the present within-starch sum of squares. The
among-starch sum of squares is obtained by subtraction. The necessity of computing
these terms makes it desirable to set up at the outset a table of the form shown
herewith.

 Source of variation      $\sum y^2$           $\sum yx$         $\sum x^2$

 Among starches         5,307,433.09       56,268.32       600.16
 Within starches        1,987,918.13        9,343.65       135.50

 Total                  7,295,351.22       65,611.97       735.66

Using the equation given on page 122, we have

$$7,295,351.22 - \frac{(65,611.97)^2}{735.66} = 1,443,557.45$$

$$1,987,918.13 - \frac{(9,343.65)^2}{135.50} = 1,343,609.68$$

and the value 99,947.77 is obtained by subtraction. The degrees of freedom
differ from the original table only in that there are 86 instead of 87 degrees of
freedom for the within-starch (error) term, for now error is measured by
variation about regression rather than variation about starch means, and 1
additional degree of freedom is lost by the calculation of the regression
coefficient from the data.

An F test yields

$$F = \frac{16,657.96}{15,623.37} = 1.066$$

which, for 6 and 86 degrees of freedom, is not significant. The differences in
breaking strengths are attributable to differences in thicknesses and not to
kinds of starches. Judgment on the basis of the original analysis of variance
might have been misleading; the inclusion of the independent variable,
thickness, was essential if proper conclusions were to be reached.
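
The whole adjusted analysis of variance follows from the small table of sums of squares and products. A minimal sketch, with illustrative names, is given below; it applies Σy² − (Σyx)²/Σx² to the total and within-starch rows and obtains the among-starch term by subtraction.

```python
def adjusted_ss(sum_yy, sum_yx, sum_xx):
    """Sum of squares of Y about the regression on X: sum y^2 - (sum yx)^2 / sum x^2."""
    return sum_yy - sum_yx ** 2 / sum_xx


# (sum y^2, sum yx, sum x^2), from the table in the text.
total_row = (7_295_351.22, 65_611.97, 735.66)
within_row = (1_987_918.13, 9_343.65, 135.50)

adjusted_total = adjusted_ss(*total_row)
adjusted_within = adjusted_ss(*within_row)
adjusted_among = adjusted_total - adjusted_within    # obtained by subtraction

among_df = 6
within_df = 87 - 1      # one further degree of freedom lost to the regression coefficient

among_ms = adjusted_among / among_df
within_ms = adjusted_within / within_df
f_ratio = among_ms / within_ms

print(f"adjusted total  = {adjusted_total:,.2f}")
print(f"adjusted within = {adjusted_within:,.2f}")
print(f"adjusted among  = {adjusted_among:,.2f}")
print(f"F = {f_ratio:.3f} on {among_df} and {within_df} degrees of freedom")
```

Set beside the original among-starch mean square of 884,572.18, the adjusted figure shows how much of the apparent starch effect was carried by thickness.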









CHAPTER IV 


SYSTEMATIC QUALITY CONTROL 

4.1 Introduction. The preceding chapters considered the design 
and analysis of industrial experiments which aim to identify the factors 
responsible for variable quality. The present chapter describes the con- 
tribution to this objective of a routinized system of recording quality 
data. 

In this chapter our objective is, essentially, to judge the quality of
current output against standards. The standards may be set by 
technical commissions, by government, or they may represent the 
quality of the product during the past. If current output, as known 
from current samples of information, departs from the standard by an 
amount which is statistically significant, an economic loss may be 
involved. Output, the quality of which is significantly higher than 
intended, implies wastage of labor and materials; the dis-economies of 
lower than standard quality are equally obvious. If possible, the re- 
sponsible factor or factors should be immediately identified and removed. 

Standards formed from past experience may be based on the quality 
records of a fixed period of time during the past or on a period which 
changes as time goes on in order to incorporate the records of the more 
recent past. There are also variations of these schemes. We shall 
illustrate only the case in which the time period is increasing. That is, 
after the quality of the output of say the 50th day is judged against the 
standard of the previous 49 days, the data of the 50th day is added to the 
previous population, and the quality of the 51st day is judged by com- 
parison with the standards based on 50 days of data. The method of 
handling fixed or other varieties of shifting standards involves only 
minor changes in the following discussion and the reader can supply them 
for himself. 

Standards based on accumulated data are valid only if those data are
homogeneous. Thus a standard in the form of a mean of, say, 20 weeks'
data has little sense if the 20 weekly means differ significantly among
themselves. We shall test each population for such homogeneity. In
order to preserve the homogeneity of shifting populations it has
occasionally been the practice not to add to the population without
adjustment any data on current output which departs significantly from the
standard. This practice has no statistical justification and is not
recommended.

Finally the populations must be large relative to the size of the current
samples. This is clearly sensible from a practical point of view;
moreover it is important statistically if we are to assume exact knowledge of
such population parameters as $\bar{X}'$ and $\sigma$, which are or form the basis of
industrial standards.

4.2 Population: formation and homogeneity. Supplement B of the
American Society for Testing Materials' “Manual on Presentation of
Data” (1) gives the following information on an operating characteristic:

 Sample number    Sample size n    Mean quality $\bar{X}$    Standard deviation s

       1               50              35.7                 5.35
       2               50              34.6                 5.03
       3               50              32.6                 3.43
       4               50              35.3                 4.55
       5               50              33.4                 4.10
       6               50              35.2                 4.30
       7               50              33.3                 5.18
       8               50              33.9                 5.30
       9               50              32.3                 3.09
      10               50              33.7                 3.67

From this information we want to estimate the mean $\bar{X}'$ and the
standard deviation $\sigma$ of the population of 500 observations formed by
combining these 10 samples.

For k samples of size $n_1, \ldots, n_k$ with respective means $\bar{X}_1, \ldots, \bar{X}_k$
and respective variances $s_1^2, \ldots, s_k^2$ we know that

$$\bar{X}' = \frac{\sum n\bar{X}}{\sum n}$$

and

$$\sigma^2 = \frac{\sum n s^2}{\sum n - k}$$

In the present example $\bar{X}' = 34.0$

and $\sigma = 4.51$

Is this population homogeneous in its mean? Assuming normality,
approximately 90 per cent of the sample means (i.e., nine means) should
fall within $\bar{X}' \pm 1.65\sigma_{\bar{x}}$ where $\sigma_{\bar{x}} = \sigma / \sqrt{n}$. We have

$$\sigma_{\bar{x}} = \frac{4.51}{\sqrt{50}} = 0.638$$

Only five means fall within $\bar{X}' \pm 1.65\sigma_{\bar{x}}$. The population cannot be
said to be homogeneous in its mean.

Possibly the lack of homogeneity of the population is due to significant
differences among the 10 sample standard deviations. The distribution
of s for random samples from a normal population was given on page 46.
For large samples, say n > 30, it is easy to show that this distribution is
approximately normal with standard error $\sigma_s$ equal to $\sigma / \sqrt{2n}$. We have

$$\sigma_s = \frac{4.51}{\sqrt{100}} = 0.451$$

Five values of s fall outside the range $\sigma \pm 1.65\sigma_s$; the sample standard
deviations differ significantly among themselves. The parameters
of this population ($\bar{X}'$ and $\sigma$) could not be effectively used as standards
with which to compare current quality.

[Table: quality control record for the Btu content of a mixed fuel gas.
Column headings: sample number; sample size n; total of observed data on Btu
content; sample mean $\bar{X}$; total observations to date $\sum n$; total of observed
data on Btu content to date; mean to date $\bar{X}'$; $\sum (X - \bar{X})^2$ for each sample;
sample standard deviation s; $\sum (X - \bar{X})^2$ for all data to date; number of
observations to date, less number of samples; estimated standard deviation
to date $\sigma$; range. Only the three columns below are legible.]

 Sample number    Mean to date $\bar{X}'$    $\sum (X - \bar{X})^2$ for     $\sum (X - \bar{X})^2$ for all
                                      each sample           data to date

       1             537.36            300.930               300.930
       2             540.25            231.160               532.090
       3             539.45            201.720               733.810
       4             538.45            173.430               907.240
       5             538.36             86.000               993.240
       6             538.13            204.000             1,197.240
       7             537.67             64.930             1,262.170
       8             537.30            108.856             1,371.026
       9             537.45            177.220             1,548.246
      10             538.21            216.930             1,765.176
      11             538.60            269.430             2,034.606
      12             538.77             16.856             2,051.462
      13             538.90            147.220             2,198.682
      14             538.88            196.220             2,394.902
      15             539.01             84.930             2,479.832
      16             539.20            160.930             2,640.762
      17             539.26            248.356             2,889.118
      18             539.19            107.720             2,996.838
      19             539.13            223.000             3,219.838
      20             539.06            138.356             3,358.194
      21             538.99            185.500             3,543.694
      22             538.87             77.220             3,620.914
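
The homogeneity check of section 4.2 can be retraced from the ten sample means and standard deviations given there. The sketch below is an illustration with hypothetical variable names; it assumes the pooling formula σ² = Σns²/(Σn − k), which reproduces the quoted 4.51, and then counts the means and standard deviations lying inside and outside the 1.65 standard-error limits.

```python
import math

# (sample size n, mean quality, standard deviation s) for the ten samples of section 4.2
samples = [
    (50, 35.7, 5.35), (50, 34.6, 5.03), (50, 32.6, 3.43), (50, 35.3, 4.55),
    (50, 33.4, 4.10), (50, 35.2, 4.30), (50, 33.3, 5.18), (50, 33.9, 5.30),
    (50, 32.3, 3.09), (50, 33.7, 3.67),
]
k = len(samples)
total_n = sum(n for n, _, _ in samples)

# Population mean and standard deviation from the combined samples.
pooled_mean = sum(n * mean for n, mean, _ in samples) / total_n
pooled_sigma = math.sqrt(sum(n * s ** 2 for n, _, s in samples) / (total_n - k))

# Roughly 90 per cent of sample means should lie within 1.65 standard errors.
means_inside = 0
s_outside = 0
for n, mean, s in samples:
    se_mean = pooled_sigma / math.sqrt(n)        # standard error of a sample mean
    se_s = pooled_sigma / math.sqrt(2 * n)       # large-sample standard error of s
    if abs(mean - pooled_mean) <= 1.65 * se_mean:
        means_inside += 1
    if abs(s - pooled_sigma) > 1.65 * se_s:
        s_outside += 1

print(f"population mean = {pooled_mean:.1f}, sigma = {pooled_sigma:.2f}")
print(f"{means_inside} of {k} means inside the limits; "
      f"{s_outside} of {k} standard deviations outside")
```

With about nine of ten means expected inside the limits and only five found there, the count alone is enough to reject homogeneity, as the text concludes.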

4.3 Example involving $\bar{X}$. The data in the first three columns of
the preceding table have been reported by Pettebone and Young (32). 
They cover 50 consecutive samples each of 14 observations; the quality 
characteristic is the Btu value of a mixed fuel gas. 

The functions of the various columns will be discussed presently. The 
tabular form used was originally given by Dudding and Baker (10). 

In order to permit the reader to check all entries in the first three 
rows of the preceding table, the individual observations of the first 
three samples are now given. 


                 Btu value

    Sample 1      Sample 2      Sample 3

      533           546           541
      537           540           534
      535           542           528
      540           550           538
      542           541           542
      547           547           540
      543           546           539
      536           539           538
      531           542           543
      530           535           533
      536           542           538
      534           542           537
      538           544           540
      541           549           539

4.4 The range. In the last column the range, which is the difference
between the largest and smallest observations, is recorded. The
principal merit of the range lies in the ease with which it is computed, and its
utility arises from the fact that for a normal population the mean range
over k samples (k large) bears a fixed relationship to the more common
measure of variation, the standard deviation $\sigma$ of the population. For
k = 50, the mean range will be found to be 10.68. For sample size n =
14, we find from Table VI

$$\frac{10.68}{\sigma} = 3.40676$$

or

$$\sigma = 3.135$$

which is in good agreement with $\sigma = 3.26$ found by the more efficient
method that is illustrated on page 127. For small k, say less than 10,
or large n, say more than 15, the mean range method of estimating $\sigma$ is
unreliable. In our example, the agreement between the two methods of
estimating $\sigma$ happens to be good even for small k. Thus for k = 2, the
two methods of estimating yield 4.52 and 4.70; for k = 5, they yield
3.19 and 3.93.
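
The range estimate is a single division, sketched below. The factor 3.40676 for samples of 14 is the one the text takes from Table VI; the variable names are illustrative, and the third sample's observations are used only to show how a range entry in the record is formed.

```python
mean_range = 10.68             # mean of the 50 sample ranges, from the text
range_factor_n14 = 3.40676     # mean range divided by sigma for n = 14 (Table VI)

sigma_from_range = mean_range / range_factor_n14

# The range of a single sample is its largest observation less its smallest.
sample_3 = [541, 534, 528, 538, 542, 540, 539, 538, 543, 533, 538, 537, 540, 539]
range_3 = max(sample_3) - min(sample_3)

print(f"sigma estimated from the mean range = {sigma_from_range:.3f}")
print(f"range of sample 3 = {range_3}")
```

The appeal of the method is that each sample contributes one subtraction, where the standard deviation requires fourteen squared deviations.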

The columns of the large table contain the information needed for a
simple quality control record. Thus, at the end of the 20th week, the
mean and standard deviation of the accumulated population are

$$\bar{X}' = 539.06, \qquad \sigma = 3.59$$

We wish to determine whether or not the mean of the sample of the
21st week (based on n observations) falls within $\pm 2\sigma_{\bar{x}}$ of $\bar{X}'$, where

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3.59}{\sqrt{14}} = 0.96$$

It does: the mean of the 21st sample, 537.50, lies between 537.14 and
540.98, so the quality of the output of the 21st week conforms to the
standard ($\bar{X}' = 539.06$). When a sample mean falls outside the limits, and
records of influential factors, such as kind of coal burned, are kept
simultaneously, the cause of lack of control can often be immediately
spotted and corrected.

The steps of this procedure are systematized in the following table.
The calculations begin with the 20th sample so that the beginning
population will be 20 times as large as the first sample to be judged.






  k      $\bar{X}'$       $\sigma$      n, size of    $\sigma_{\bar{x}} = \sigma/\sqrt{n}$    $\bar{X}' - 2\sigma_{\bar{x}}$    $\bar{X}' + 2\sigma_{\bar{x}}$    $\bar{X}$ of      Under
                          following                                               following   control
                          sample                                                  sample

 20    539.06    3.59       14          0.96         537.14       540.98      537.50     Yes.
 21    538.99    ....       14          0.96         537.07       540.91      536.36     No.
 22    538.87    3.56       14          0.95         536.97       540.77      538.07     Yes.
 23    538.84    3.54       14          0.95         536.94       540.74      538.21     Yes.
 24    538.81    3.54       14          0.95         536.91       540.71      536.93     Yes.
 25    538.73    3.52       14          0.94         536.85       540.61      538.00     Yes.
 26    538.71    3.48       14          0.93         536.85       540.57      536.14     No.
 27    538.61    3.47       14          0.93         536.75       540.47      535.29     No.
 28    538.49    3.43       14          0.92         536.65       540.33      534.43     No.
 29    538.35    3.39       14          0.91         536.53       540.17      536.07     No.
 30    538.28    3.36       14          0.90         536.48       540.08      ....       Yes.
 31    538.26    3.35       14          0.90         536.46       540.06      ....       No.
 32    538.19    3.34       14          0.89         536.41       539.97      537.79     Yes.
 33    538.18    3.32       14          0.89         536.40       539.96      ....       No.
 34    538.07    3.29       14          0.88         536.31       539.83      536.93     Yes.
 35    ....      3.28       14          0.88         536.27       539.79      537.14     Yes.
 36    538.01    3.24       14          0.87         536.27       539.75      537.29     Yes.
 37    ....      ....       14          0.87         536.25       539.73      536.14     No.
 38    ....      ....       14          0.87         536.20       539.68      ....       Yes.
 39    ....      3.25       14          0.87         ....         539.66      ....       Yes.
 40    537.91    3.24       14          0.87         536.17       539.65      ....       Yes.
 41    537.94    3.22       14          0.86         536.22       539.66      ....       No.
 42    537.85    ....       14          0.86         536.13       539.57      536.43     Yes.
 43    537.82    ....       14          0.86         536.10       539.54      537.57     Yes.
 44    537.81    ....       14          0.86         536.09       539.53      536.21     Yes.
 45    537.77    ....       14          0.86         536.05       539.49      537.93     Yes.
 46    ....      ....       14          0.86         536.06       539.50      534.93     No.
 47    ....      ....       14          0.86         536.00       539.44      534.43     No.
 48    ....      3.24       14          0.87         535.91       539.39      ....       Yes.
 49    ....      3.25       14          0.87         535.89       539.37      ....       Yes.
 50    ....      3.26       ..          ...
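
Each line of the control record is produced by the same short calculation, sketched below. The function name `control_check` and its arguments are illustrative; the two cases are the 22nd and 26th lines of the table, and the limits are computed from unrounded standard errors, so they may differ from the printed ones in the last decimal.

```python
import math


def control_check(population_mean, population_sigma, n, sample_mean, width=2.0):
    """Limits population_mean +/- width * sigma / sqrt(n), and whether sample_mean lies within them."""
    standard_error = population_sigma / math.sqrt(n)
    lower = population_mean - width * standard_error
    upper = population_mean + width * standard_error
    return lower, upper, lower <= sample_mean <= upper


# Population to date after 22 samples, judged against the mean of the following sample.
lower_22, upper_22, in_control_22 = control_check(538.87, 3.56, 14, 538.07)

# Population to date after 26 samples.
lower_26, upper_26, in_control_26 = control_check(538.71, 3.48, 14, 536.14)

for label, lower, upper, ok in [
    ("k = 22", lower_22, upper_22, in_control_22),
    ("k = 26", lower_26, upper_26, in_control_26),
]:
    verdict = "Yes" if ok else "No"
    print(f"{label}: limits {lower:.2f} to {upper:.2f}, under control: {verdict}")
```

Once a sample has been judged, its observations are added to the population and the mean and standard deviation to date are recomputed before the next sample is tested.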






The judgments in the last column are valid only if the population
against which the sample is being tested is homogeneous. At each stage
of the process the homogeneity of the population should be tested.
Thus at the end of 20 weeks

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3.59}{\sqrt{14}} = 0.96$$

$$\bar{X}' = 539.06$$

$$\bar{X}' - 2\sigma_{\bar{x}} = 537.14$$

$$\bar{X}' + 2\sigma_{\bar{x}} = 540.98$$

In 8 out of 20 samples constituting this population (which is far greater
than the 5 per cent of 20, or 1 sample, which could be attributed to
chance) the sample mean $\bar{X}$ fell outside the limits

$$\bar{X}' \pm 2\sigma_{\bar{x}}$$

This lack of control is possibly attributable to significant differences
among the sample standard deviations $s_i$. Very roughly, for n as small
as 14, we have

$$\sigma_s = \frac{\sigma}{\sqrt{2n}} = \frac{3.59}{\sqrt{28}} = 0.679$$

Two of 20 values of $s_i$ fall outside $\sigma \pm 2\sigma_s$, only 1 above the allowable
limit (1 in 20).

We conclude that the population formed of the first 20 samples is 
clearly not homogeneous and the judgment of the mean quality of the 
21st sample given in the preceding table cannot properly be made. More 
often than not, in industrial practice, population homogeneity will be 
achieved only after months of effort, and a quality control program of 
the kind suggested in this chapter will not be immediately possible. 
In such cases the statistician can best serve by assisting in the design 
and analysis of experiments which aim to identify the causes of non- 
homogeneity. 

4.5 Example involving fraction defective p. A similar procedure
is available wherever quality must be recorded simply as acceptable or
not. The first three columns of the following table exhibit data, recorded
by Shoumatoff (37), covering defects found in the primary inspection of
standard radio tubes.

The most important population parameters are the fraction defective
$p'$ and the standard deviation $\sigma_p$, and these will constitute the standards
in the quality control program. Later in this chapter it will be shown
that

$$\sigma_p = \sqrt{\frac{p'q'}{n}}$$

p is the fraction defective in a sample of size n, i.e., pn is the number
of defects in a sample, $p'$ is the population fraction defective, and
$q = 1 - p$, $q' = 1 - p'$.

The various columns of the following table show the necessary
sample statistics together with constantly revised estimates of the
population parameters. Inasmuch as n differs from sample to sample,
we record $p'q'$ rather than $p'q'/n$.


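| Date | Number of tubes inspected, n | Number of tubes rejected, pn | Fraction defective in sample, p (in %) | Total number of tubes inspected to date, $\sum n$ | Total defects to date, $\sum pn$ | $\bar{p}$ to date (in %), $\sum pn / \sum n$ | $\bar{p}\bar{q}$ |
|---|---|---|---|---|---|---|---|
| 1 | 16,484 | 2,008 | 12.2 | 16,484 | 2,008 | 12.2 | 1071 |
| 2 | 24,708 | 2,719 | 11.0 | 41,192 | 4,727 | 11.5 | 1018 |
| 3 | 27,599 | 2,691 | 9.8 | 68,791 | 7,418 | 10.8 | 963 |
| 4 | 28,545 | 2,699 | 9.5 | 97,336 | 10,117 | 10.4 | 932 |
| 5 | 31,530 | 3,377 | 10.7 | 128,866 | 13,494 | 10.5 | 940 |
| 6 | 8,588 | 1,100 | 12.8 | 137,454 | 14,594 | 10.6 | 948 |
| 7 | 19,574 | 1,478 | 7.6 | 157,028 | 16,072 | 10.2 | 916 |
| 8 | 28,644 | 2,170 | 7.6 | 185,672 | 18,242 | 9.8 | 884 |
| 9 | 29,256 | 2,214 | 7.6 | 214,928 | 20,456 | 9.5 | 860 |
| 10 | 32,605 | 2,540 | 7.8 | 247,533 | 22,996 | 9.3 | 844 |
| 11 | 9,314 | 750 | 8.1 | 256,847 | 23,746 | 9.2 | 835 |
| 12 | 16,163 | 1,108 | 6.9 | 273,010 | 24,854 | 9.1 | 827 |
| 13 | 25,601 | 1,945 | 7.6 | 298,611 | 26,799 | 9.0 | 819 |
| 14 | 22,170 | 1,690 | 7.6 | 320,781 | 28,489 | 8.9 | 811 |
| 15 | 26,462 | 2,162 | 8.2 | 347,243 | 30,651 | 8.8 | 803 |
| 16 | 7,955 | 671 | 8.4 | 355,198 | 31,322 | 8.8 | 803 |
| 17 | 11,908 | 790 | 6.6 | 367,106 | 32,112 | 8.7 | 794 |
| 18 | 23,162 | 1,641 | 7.1 | 390,268 | 33,753 | 8.6 | 786 |
| 19 | 24,154 | 1,890 | 7.8 | 414,422 | 35,643 | 8.6 | 786 |
| 20 | 25,287 | 1,911 | 7.6 | 439,709 | 37,554 | 8.5 | 778 |
| 21 | 4,955 | 517 | 10.4 | 444,664 | 38,071 | 8.6 | 786 |
| 22 | 20,095 | 1,525 | 7.6 | 464,759 | 39,596 | 8.5 | 778 |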


At the end of the 15th day, we have, for the accumulated population
to that date

$$\bar{p} = 8.8 \text{ per cent}$$

$$\bar{p}\bar{q} = 803$$

Does the mean quality of the output of the 16th day conform to the
standard set from this short period? We have

$$\sigma_p = \sqrt{\frac{\bar{p}\bar{q}}{n}} = \sqrt{\frac{803}{7{,}955}} = 0.32 \text{ per cent}$$

The percentage defective of the 16th day is 8.4, which lies within
$\bar{p} \pm 2\sigma_p$, that is, between 8.2 and 9.4. The advance in quality of
0.4 per cent from the standard $\bar{p}$ is reasonably attributable to chance,
and no inquiry is warranted. This is not true of the remaining days, all
of which show significant departures from the standard.

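The final steps of the procedure are shown in the following table.

| Date | $\bar{p}$ (per cent) | $\bar{p}\bar{q}$ | n, number of tubes in following sample | $\sigma_p$ | $\bar{p} - 2\sigma_p$ | $\bar{p} + 2\sigma_p$ | p, fraction defective in following sample | Under control |
|---|---|---|---|---|---|---|---|---|
| 15 | 8.8 | 803 | 7,955 | 0.32 | 8.2 | 9.4 | 8.4 | Yes. |
| 16 | 8.8 | 803 | 11,908 | 0.26 | 8.3 | 9.3 | 6.6 | No. |
| 17 | 8.7 | 794 | 23,162 | 0.19 | 8.3 | 9.1 | 7.1 | No. |
| 18 | 8.6 | 786 | 24,154 | 0.18 | 8.2 | 9.0 | 7.8 | No. |
| 19 | 8.6 | 786 | 25,287 | 0.18 | 8.2 | 9.0 | 7.6 | No. |
| 20 | 8.5 | 778 | 4,955 | 0.40 | 7.7 | 9.3 | 10.4 | No. |
| 21 | 8.6 | 786 | 20,095 | 0.20 | 8.2 | 9.0 | 7.6 | No. |
| 22 | 8.5 | 778 | .... | | | | .... | |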

As in the previous example, the homogeneity of the current population
of sample fraction defectives must be examined. This is somewhat
laborious for variable n. To test the homogeneity of the population
accumulated to the fifteenth day, we have $\bar{p}$ = 8.8 per cent,
$\bar{q}$ = 91.2 per cent and 15 values of $\sigma_p$, depending on n.

| Date | $\sigma_p$ | $\bar{p} - 2\sigma_p$ | $\bar{p} + 2\sigma_p$ |
|---|---|---|---|
| 1 | 0.22 | 8.4 | 9.2 |
| 2 | 0.18 | 8.4 | 9.2 |
| 3 | 0.17 | 8.5 | 9.1 |
| 4 | 0.17 | 8.5 | 9.1 |
| 5 | 0.16 | 8.5 | 9.1 |
| 6 | 0.31 | 8.2 | 9.4 |
| 7 | 0.20 | 8.4 | 9.2 |
| 8 | 0.17 | 8.5 | 9.1 |
| 9 | 0.17 | 8.5 | 9.1 |
| 10 | 0.16 | 8.5 | 9.1 |
| 11 | 0.29 | 8.2 | 9.4 |
| 12 | 0.22 | 8.4 | 9.2 |
| 13 | 0.18 | 8.4 | 9.2 |
| 14 | 0.19 | 8.4 | 9.2 |
| 15 | 0.17 | 8.5 | 9.1 |

All 15 sample percentage defectives fall outside these limits. There is,
therefore, no evidence of control in the production of these tubes during
this (far too brief) 15-day period, and the judgment previously passed on
the quality of the tubes of the 16th day must be rescinded. Allocable
causes of variability are present and before any effective quality control
program can be set up, as many as possible of these causes must be
discovered and removed.

A similar test of homogeneity must be carried out on each successive 
population. 
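
The control test of 4.5 can be sketched in a few lines of Python. This is only an illustration, not the procedure as originally tabulated: the function names `p_limits` and `in_control` are hypothetical, percentages are used throughout as in the tables, and the multiplier `k = 2` is the two-sigma limit used in the text.

```python
import math

def p_limits(p_bar_pct, n, k=2.0):
    """Return (sigma_p, lower, upper), all in per cent, for a sample of n
    pieces judged against a standard fraction defective p_bar_pct."""
    q_bar_pct = 100.0 - p_bar_pct
    sigma_p = math.sqrt(p_bar_pct * q_bar_pct / n)
    return sigma_p, p_bar_pct - k * sigma_p, p_bar_pct + k * sigma_p

def in_control(p_bar_pct, n, sample_pct, k=2.0):
    """True if the sample percentage defective lies within the limits."""
    _, lower, upper = p_limits(p_bar_pct, n, k)
    return lower <= sample_pct <= upper

# Standard after day 15 (8.8 per cent) applied to day 16
# (n = 7,955 tubes, 8.4 per cent defective), as in the text.
sigma_p, lower, upper = p_limits(8.8, 7955)
print(round(sigma_p, 2), round(lower, 1), round(upper, 1))
print(in_control(8.8, 7955, 8.4))

# Standard after day 20 (8.5 per cent) applied to day 21
# (n = 4,955 tubes, 10.4 per cent defective).
print(in_control(8.5, 4955, 10.4))
```

The same two functions, applied to each of the first 15 days with the accumulated standard of 8.8 per cent, give the homogeneity test described above.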

NOTES 


4.6 Probability of t defects. Let the probability of a defect be p. Let
q be the probability that the piece is good, p + q = 1. If a random sample
of n pieces is taken, each selection of a piece being independent of all others,
what is the probability $P_t$ of obtaining exactly t defective pieces and n - t
good pieces?

The first t pieces may be defective and the remainder good. This probability
is $q^{n-t}p^t$. But t defective pieces may be obtained in as many ways as one
can form combinations of n pieces taken t at a time, namely, $C_t^n$, where

$$C_t^n = \frac{n!}{t!\,(n-t)!}$$

Hence the answer is

$$P_t = C_t^n q^{n-t} p^t$$

Now $C_t^n q^{n-t} p^t$ is the general expression for the terms of the expansion of
$(q + p)^n$, i.e.,

$$1 = (q + p)^n = q^n + nq^{n-1}p + \frac{n(n-1)}{2!}q^{n-2}p^2 + \cdots + C_t^n q^{n-t}p^t + \cdots + nqp^{n-1} + p^n$$

Hence the successive terms of the above expression give the probability of
0, 1, 2, ..., n defective pieces. For example, for p = 1/6, q = 5/6, and n = 8,
the probabilities are shown in the table below.

| t | $C_t^n$ | $q^{n-t}p^t$ | $P_t$ |
|---|---|---|---|
| 0 | 1 | 0.2325680 | 0.233 |
| 1 | 8 | 0.0465136 | 0.372 |
| 2 | 28 | 0.0093027 | 0.261 |
| 3 | 56 | 0.0018605 | 0.104 |
| 4 | 70 | 0.0003721 | 0.026 |
| 5 | 56 | 0.0000744 | 0.004 |
| 6 | 28 | 0.0000149 | 0.000 |
| 7 | 8 | 0.0000030 | 0.000 |
| 8 | 1 | 0.0000006 | 0.000 |
| | | Total | 1.000 |

The accompanying graph illustrates these results.
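
The table of 4.6 can be regenerated directly from the binomial term $C_t^n q^{n-t} p^t$. The following is a minimal sketch; the function name `binomial_probs` is an illustrative choice, not part of the text.

```python
from math import comb

def binomial_probs(n, p):
    """Probabilities of exactly 0, 1, ..., n defective pieces in a random
    sample of n pieces, each defective with probability p."""
    q = 1.0 - p
    return [comb(n, t) * q ** (n - t) * p ** t for t in range(n + 1)]

probs = binomial_probs(8, 1 / 6)
for t, prob in enumerate(probs):
    # columns: t, number of combinations, probability of exactly t defects
    print(t, comb(8, t), f"{prob:.3f}")
print("total", f"{sum(probs):.3f}")
```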



4.7 Mean and variance of the fraction defective. We shall calculate
the mean and variance of such a distribution of probabilities.

Consider an n-fold experiment (n = 8, above). Repeat it N times. Then,
for the first moment ${}_0M_1$ around the origin (${}_0M_1$ is the arithmetic mean), we
have

$${}_0M_1 = \text{mean number of defects} = \frac{f_0 \times 0 + f_1 \times 1 + \cdots + f_n \times n}{N}$$

where $f_0$ is the frequency of zero defects, $f_1$ the frequency of 1 defect, etc.

$${}_0M_1 = \frac{Nq^n \times 0 + Nnq^{n-1}p \times 1 + N\frac{n(n-1)}{2!}q^{n-2}p^2 \times 2 + \cdots + Np^n \times n}{N}$$

$$= np(q + p)^{n-1} = np$$

To find the variance we first compute the second moment ${}_0M_2$ about zero.
The method is due to Bowley (3).

$${}_0M_2 = \frac{f_0 \times 0^2 + f_1 \times 1^2 + \cdots + f_n \times n^2}{N} = \sum_{t=0}^{n} t^2\, C_t^n q^{n-t} p^t$$

$$= \sum_{t=0}^{n} \left[t(t-1) + t\right] \frac{n(n-1)(n-2)\cdots(n-t+1)}{t!}\, q^{n-t} p^t$$

$$= n(n-1)p^2(q + p)^{n-2} + np(q + p)^{n-1}$$

$$= n^2p^2 + npq$$

But

$${}_0M_2 = \sigma^2 + (\text{mean})^2 \; {}^*$$

$$= \sigma^2 + (np)^2$$

$$\sigma^2 = {}_0M_2 - (np)^2 = n^2p^2 + npq - n^2p^2 = npq$$


In a similar way

$$\sqrt{\beta_1} = \frac{q - p}{\sqrt{pqn}}$$

and

$$\beta_2 = 3 + \frac{1 - 6pq}{pqn}$$

Note that as n increases, $\sqrt{\beta_1} \to 0$ and $\beta_2 \to 3$; the distribution of
binomial probabilities approaches normality even for $p \neq q$.

To illustrate some of the foregoing results: if the probability of a defective
piece in a population is p = 1/6 and if we draw at random n = 1000 pieces, the
mean (expected) number of defects is pn = 167 and the standard deviation is
$\sqrt{pqn}$ = 11.8. As the distribution of frequencies of defects is approximately
normal for n = 1000, we conclude (from Table IV) that in the absence
of all "causes" except that of random sampling variation, about 95 per cent
of such 1000-observation experiments should have a frequency of defects within

$$pn \pm 2\sqrt{pqn}$$

that is

$$167 \pm 23.6 \qquad [1]$$


It is, however, generally more useful to record limits on proportions
(probabilities) than on frequencies. Each probability value is one nth of the
corresponding frequency value; we have

$$\text{mean fraction defective} = \frac{pn}{n} = p$$

$$\text{standard deviation of } p = \sigma_p = \frac{\sqrt{pqn}}{n} = \sqrt{\frac{pq}{n}}$$

Thus in about 95 cases in 100 the proportion of defects in the 1000-observation
experiments should fall between

$$p \pm 2\sqrt{\frac{pq}{n}}$$

or

$$\tfrac{1}{6} \pm 0.0236$$

which is, of course, equivalent to [1].

* For any variate x

$$\frac{\sum (x - 0)^2}{n} = \frac{\sum (x - \bar{x} + \bar{x} - 0)^2}{n} = \frac{\sum (x - \bar{x})^2}{n} + (\bar{x} - 0)^2$$

the cross-product being zero. In other symbols ${}_0M_2 = \sigma^2 + \bar{x}^2$.
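
The results of 4.7 can be checked numerically. The sketch below sums the binomial terms directly to obtain the first two moments and compares them with np and npq, then reproduces the 1000-piece illustration. The function names are hypothetical and chosen only for this example.

```python
import math
from math import comb

def moments_by_summation(n, p):
    """First and second moments about zero of the number of defects,
    obtained by summing over the binomial terms."""
    q = 1.0 - p
    terms = [comb(n, t) * q ** (n - t) * p ** t for t in range(n + 1)]
    m1 = sum(t * prob for t, prob in enumerate(terms))
    m2 = sum(t * t * prob for t, prob in enumerate(terms))
    return m1, m2

def binomial_summary(n, p):
    """Mean, standard deviation, sqrt(beta_1) and beta_2 from the formulas."""
    q = 1.0 - p
    npq = n * p * q
    return n * p, math.sqrt(npq), (q - p) / math.sqrt(npq), 3.0 + (1.0 - 6.0 * p * q) / npq

# Check of the derivation for n = 8, p = 1/6
m1, m2 = moments_by_summation(8, 1 / 6)
print(m1, m2 - m1 ** 2)          # compare with np and npq

# Illustration in the text: n = 1000, p = 1/6
mean, sd, root_beta1, beta2 = binomial_summary(1000, 1 / 6)
print(round(mean), round(sd, 1))                  # expected number of defects and its s.d.
print(round(2 * sd, 1), round(2 * sd / 1000, 4))  # half-widths of the limits on counts and on fractions
```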

4.8 Limits for control. In setting limits of $\pm 2\sigma_{\bar{X}}$ or $\pm 2\sigma_p$ we expect,
even in the ideal case of absence of allocable causes, to find the mean or
fraction defective outside these limits in about 5 per cent of the random
samples. Thus, as we wish to investigate the reasons for every lapse in quality
(outside $\pm 2\sigma_{\bar{X}}$ or $\pm 2\sigma_p$) we shall 5 per cent of the time find no allocable
cause whatever; for example, the sample fraction defective, though outside
$\pm 2\sigma_p$ of p, actually did occur by chance. If broader limits, say $\pm 3\sigma_p$, are set,
the fraction defective will, in the absence of all but chance forces, fall outside
these limits in only about 3 of each 1000 random samples. Although we shall
then be less frequently searching for "causes" that do not exist, we shall
more frequently not be searching for "causes" that may exist. Thus with
$\pm 3\sigma_p$ a deviation from p so large that it could be expected to happen only
once in 100 samples would not be considered to be evidence of lack
of control.

This may be stated somewhat differently: if the limits are set
at $\pm 3\sigma_p$ and if the true value of the percentage defective p' for
current output (from which the current sample is drawn) is far off the
standard p, say at $p + 3\sigma_p$, then as many as half of all random
samples drawn from current output would indicate control, i.e., their
percentage defectives would fall within $\pm 3\sigma_p$. With the same
out-of-control value for the lot and limits of $\pm 2\sigma_p$, only 16 per cent
of the samples will fall within the control limits.

Limits cannot be set with security until one has accumulated experience
as to what limits are economic. It may be suggested that results between
$\pm 2\sigma_p$ and $\pm 3\sigma_p$ will bear considerable investigation, whereas results
exceeding $\pm 3\sigma_p$ should always receive extensive investigation.
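
The percentages quoted in 4.8 follow from the normal curve. The sketch below assumes, as the text does, that the sample statistic is approximately normally distributed and that its standard deviation is unchanged when the process level shifts; `phi`, `false_alarm_rate` and `miss_rate` are illustrative names.

```python
import math

def phi(z):
    """Cumulative probability of the standard normal curve."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def false_alarm_rate(k):
    """Chance that a sample from a controlled process falls outside +/- k sigma."""
    return 2.0 * (1.0 - phi(k))

def miss_rate(k, shift):
    """Chance that a sample falls inside +/- k sigma of the standard when the
    true level lies `shift` sigma above the standard."""
    return phi(k - shift) - phi(-k - shift)

print(round(false_alarm_rate(2), 3))   # about 5 per cent
print(round(false_alarm_rate(3), 4))   # about 3 in 1000
print(round(miss_rate(3, 3), 2))       # about half
print(round(miss_rate(2, 3), 2))       # about 16 per cent
```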

4.9 Notes on $\sigma_p$ and p. Variation in p from sample to sample, as measured
by $\sigma_p$, is presumed to be unallocable, i.e., attributable only to the errors of
sampling. In industrial data, variation in p from sample to sample is composed
not only of such residual errors but of identifiable factors (in our example,
differences in workroom humidity). It would not, however, be appropriate
to include this element of variability in our estimate of $\sigma_p$, for our purpose
is to compare the actual variability in p with the variability that would be
expected under ideal conditions (random sampling effects only).

p actually varies for another reason, for in industrial practice samples are
generally drawn without replacement from a finite population. Assume the
population consists of 100,000 tubes, 5 per cent of which are defective. If the
first draw brings forth a bad tube, the probability of a defect is now no longer
0.05 but

$$\frac{4{,}999}{99{,}999}$$

which is not quite 0.05. This point is relevant if we are interested in
determining the proportion defective in the batch; but in quality control, where
the interest is in spotting absence of control, the population may be
considered to be infinite.



CHAPTER V 


SAMPLING AND THE RISKS OF PRODUCERS AND BUYERS 

5.1 Introduction. A lot of merchandise must often be judged 
acceptable or not on the basis of information provided by a sample drawn 
from the lot. In such cases, the producer and buyer will have to incur 
risks, respectively, of (1) having satisfactory lots rejected and (2) re- 
ceiving poor lots. If numerical values can be placed on these risks, we 
may, under certain assumptions to be stated presently, determine the 
size of sample to be examined and the value of the sample statistic which 
will differentiate acceptable from non-acceptable lots. Or, if the sample 
size and the value of the sample statistic are set by authority, the re- 
spective risks may be determined. 

5.2 Assumptions. In the methods used in this chapter, lots are 
assumed to be infinite in size relative to the samples drawn from them. 
It is also assumed that these infinite populations are approximately 
normal. For example, even though mean quality may decline from $\bar{X}'$
to $\bar{X}''$, the distribution of quality is assumed to remain normal. Finally,
the method of sampling from the lots will always be the random method. 
Further assumptions specific to particular measures of quality will be 
discussed along with those measures. 

Some experience with the methods of this chapter indicates that these 
assumptions, while severe, do not prevent effective practical usage of 
these methods. 

This procedure will now be illustrated by four common measures of 
quality: arithmetic mean, fraction defective, standard deviation, and 
the coefficient of variation, the latter being the standard deviation divided 
by the arithmetic mean. 

For example, assume that the fraction defective is used as a measure
of quality and that the producer's lots average p. This figure may be
acceptable to the buyer who wishes, however, to be protected against
the receipt of inferior lots, of quality $p_1$ or greater, where $p_1 > p$, for
product defective to this extent will materially affect his operations.
The desired specification will state that a sample of n specimens should
be drawn from each lot and that the lot be marked satisfactory if the sample
shows no more than c defective specimens. Such a specification must
consider two objectives: first, as already mentioned, the buyer's risk of
receiving highly defective merchandise shall be small (say 1 in 100), and
second, the producer's risk of having normally good output (of quality p)
rejected shall also be small.

5.3 Producer and buyer risk, using means (Dodge, 9). A lot is to
be judged acceptable or not on the basis of the value of the average
quality $\bar{X}$ of a sample of n pieces drawn at random from that lot. The
producer, who is presumed to be manufacturing the product at a statistically
controlled or near-controlled plant average quality $\bar{X}'$, would like
to run a small risk P of having a lot rejected. The buyer wants to run
a small risk B of obtaining lots whose average quality is as low as $\bar{X}''$
or lower. We want to determine $\bar{X}$ and n. The situation is shown
graphically below.


[Figure: horizontal axis, Quality. Curves shown: the producer's near-normal
population with mean $\bar{X}'$ and variance $\sigma^2$; the near-normal buyer's
"tolerance" population of mean $\bar{X}''$ and variance $\sigma^2$; means of samples of
size n drawn from the producer's controlled population; means of samples of
size n from the population of tolerance quality.]


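The producer's requirements are given by

$$\frac{\bar{X}' - \bar{X}}{\sigma_{\bar{X}}} = \frac{\bar{X}' - \bar{X}}{\sigma/\sqrt{n}} = e$$

and the buyer's requirements by

$$\frac{\bar{X} - \bar{X}''}{\sigma_{\bar{X}}} = \frac{\bar{X} - \bar{X}''}{\sigma/\sqrt{n}} = f$$

where e and f are the normal deviates corresponding to the risks P and B.

In addition to the assumptions stated in 5.2, it is assumed here that
the lot of "tolerance" quality $\bar{X}''$ has the same variance $\sigma^2$ as the
lot of usual mean quality $\bar{X}'$. It may also be noted that if the samples
are drawn carefully at random, strict normality is not necessary for
effective application of the present theorems.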

Supplement B of the American Society for Testing Materials'
publication "Manual on Presentation of Data" (1) gives the following data
on an operating characteristic. High values indicate high quality.
Ranges are recorded rather than the standard deviations, for, as already
suggested, ranges are easily calculated and for n < 15 a good estimate
of $\sigma$ can be formed from the mean range.



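| No. | Number of tests made | Average quality | Range |
|---|---|---|---|
| 1 | 9 | 37.6 | 9.5 |
| 2 | 9 | 31.4 | 6.0 |
| 3 | 9 | 34.7 | 13.5 |
| 4 | 9 | 35.8 | 12.0 |
| 5 | 9 | 38.5 | 21.0 |
| 6 | 9 | 34.2 | 17.5 |
| 7 | 9 | 36.1 | 15.5 |
| 8 | 9 | 32.3 | 18.0 |
| 9 | 9 | 35.0 | 12.5 |
| 10 | 9 | 33.9 | 14.0 |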


Assume the producer wants to run no more than 2 chances in 100 that
a lot will be rejected and the buyer wants to run no more than 1 chance
in 100 that he will receive a lot with average quality less than 30. How
many pieces n from each lot are to be tested and what is the sample
average $\bar{X}$ which will differentiate acceptable from rejected lots?

We have

$$\bar{X}' = 35, \qquad \bar{X}'' = 30, \qquad P = 0.02, \qquad B = 0.01$$

The "population" of data meets the requirements for control laid down
in the preceding chapter. To compute the standard deviation $\sigma$ from
the ranges: we know that for small samples all of size n and drawn from a
normal population the ratio

$$\frac{\text{Mean range}}{\text{Standard deviation}}$$

is a constant which depends only on n. From our data and Table VI, for n = 9,

$$\text{Mean range} = 14.0$$

$$\frac{\text{Mean range}}{\text{Standard deviation}} = 2.970$$

$$\sigma = \frac{14.0}{2.970} = 4.714$$

From Table IV we find, corresponding to P = 0.02 and B = 0.01,

$$e = 2.054$$

$$f = 2.326$$

Finally

$$\frac{35 - \bar{X}}{4.714/\sqrt{n}} = 2.054$$

$$\frac{\bar{X} - 30}{4.714/\sqrt{n}} = 2.326$$

from which

$$\bar{X} = 32.65$$

$$n = 17.1$$

i.e., a sample mean of 32.65 and a sample size of about 18.

Inasmuch as the lot is not indefinitely larger than the sample (n = 18)
the results may be accepted only as rough approximations.
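
The two requirements of 5.3 are a pair of simultaneous equations in $\bar{X}$ and $\sqrt{n}$; adding them eliminates $\bar{X}$. A minimal sketch of the solution follows. The function name `mean_sampling_plan` is hypothetical, and the normal deviates e and f are taken from the text rather than computed.

```python
import math

def mean_sampling_plan(mean_usual, mean_tolerance, sigma, e, f):
    """Solve (mean_usual - x) / (sigma / sqrt(n)) = e and
    (x - mean_tolerance) / (sigma / sqrt(n)) = f for n and x."""
    # Adding the two equations gives (mean_usual - mean_tolerance) * sqrt(n) / sigma = e + f
    root_n = (e + f) * sigma / (mean_usual - mean_tolerance)
    n = root_n ** 2
    x_dividing = mean_usual - e * sigma / root_n
    return n, x_dividing

n, x_dividing = mean_sampling_plan(35.0, 30.0, 4.714, 2.054, 2.326)
print(round(n, 1), math.ceil(n), round(x_dividing, 2))
```

Because n must be a whole number of pieces, the computed value is rounded upward, which slightly reduces both risks.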

In place of the foregoing, the buyer may prefer to stipulate that he
wishes to run, say, no more than 1 chance in 100 that he will receive a lot,
a certain percentage of the items of which will be lower in quality than
a stated value. From such a stipulation, we may easily compute $\bar{X}''$
and proceed as above. Thus if in the present example, the buyer wished
to run only 1 chance in 100 of receiving lots with more than 5 per cent
of their contents below a quality of 25, we would have what is shown
graphically below.

$$\frac{\bar{X}'' - 25}{4.714} = 1.65$$

or

$$\bar{X}'' = 32.778$$





This value of $\bar{X}''$ would now be placed in our equations and $\bar{X}$ and n
could be found.

5.4 Producer and buyer risk, using fraction defective. As already
illustrated, quality is sometimes not recorded numerically but simply as
good or bad. We want to determine the size of sample n to be randomly
drawn for inspection or test purposes from a lot of size N and also the
maximum number of defective pieces g the sample may contain for the
lot still to be acceptable. The producer, as before, is presumed to be
manufacturing the product at a statistically controlled fraction defective
p and he wishes to run a small risk P of having his lots rejected. The
consumer wishes to run a small risk B of receiving lots which have more
than a proportion of p' defective.

For these conditions, the producer's and consumer's interests are given,
respectively, by

$$\sum_{r=g+1}^{n} \frac{C_r^{pN}\, C_{n-r}^{qN}}{C_n^{N}} = P$$

$$\sum_{r=0}^{g} \frac{C_r^{p'N}\, C_{n-r}^{q'N}}{C_n^{N}} = B$$

where

$$C_r^{pN} = \frac{(pN)!}{r!\,(pN - r)!}$$

with similar expressions for the other combinatorial terms. Considering
the lot to be indefinitely large, these become

$$\sum_{r=g+1}^{n} C_r^n\, p^r q^{n-r} = P$$

$$\sum_{r=0}^{g} C_r^n\, p'^{\,r} q'^{\,n-r} = B$$

where

$$q = 1 - p, \qquad q' = 1 - p'.$$

Finally, if p and p' are under 10 per cent, which is common in industrial
practice, and n is large (say over 100) the Poisson form of the above
equations may be safely used.

$$\sum_{r=g+1}^{n} \frac{e^{-pn}(pn)^r}{r!} = 1 - \sum_{r=0}^{g} \frac{e^{-pn}(pn)^r}{r!} = P$$

$$\sum_{r=0}^{g} \frac{e^{-p'n}(p'n)^r}{r!} = B$$

5.5 Example of 5.4. Supplement B of the American Society for
Testing Materials' "Manual on Presentation of Data" (1) gives the
following data on surface defects on galvanized hardware.


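| Lot number | Sample size | Number of defective pieces | Lot number | Sample size | Number of defective pieces |
|---|---|---|---|---|---|
| 1 | 580 | 9 | 17 | 640 | 3 |
| 2 | 550 | 7 | 18 | 580 | 4 |
| 3 | 580 | 3 | 19 | 510 | 6 |
| 4 | 640 | 9 | 20 | 580 | 8 |
| 5 | 760 | 11 | 21 | 600 | 8 |
| 6 | 760 | 12 | 22 | 640 | 12 |
| 7 | 510 | 9 | 23 | 640 | 9 |
| 8 | 550 | 10 | 24 | 580 | 8 |
| 9 | 640 | 10 | 25 | 580 | 8 |
| 10 | 640 | 10 | 26 | 510 | 4 |
| 11 | 640 | 8 | 27 | 640 | 6 |
| 12 | 640 | 10 | 28 | 550 | 8 |
| 13 | 580 | 7 | 29 | 550 | 8 |
| 14 | 580 | 9 | 30 | 430 | 3 |
| 15 | 550 | 5 | 31 | 430 | 6 |
| 16 | 430 | 5 | | | |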



How many pieces should be taken in each sample, and what is the 
largest number of defective pieces a sample may contain for the lot still 
to be accepted? 

The producer’s mean quality is given by 0.013 and the population 
may be shown, by the methods of the preceding chapter, to be under 
statistical control. Assume that the producer wishes to run not more 
than 1 chance in 100 (because of high manufacturing costs) of having 
lots rejected while the consumer is willing to run as many as 5 chances in 
100 (because of relative ease of replacement) of having as much as 5 
per cent of the product defective. We have 

p =0.013 

p' = 0.05 

P = 0.01 

B = 0.05 

Assume an answer, n = 200. We have

$$\sum_{r=g+1}^{n} \frac{e^{-2.6}(2.6)^r}{r!} = 0.01$$

$$\sum_{r=g+1}^{n} \frac{e^{-10.0}(10.0)^r}{r!} = 1 - 0.05 = 0.95$$

From Figure 1, the first equation is satisfied by g + 1 = 7.5, approximately.
But the second requires g + 1 = 5.5, so n = 200 is not a solution.
By trial and error we come to n = 300 and g + 1 = 9.7 as an
approximate solution. The sample size should be about 300 and the lot
should not be accepted if the sample contains more than about nine
defective pieces.

Surface defects may be easily noted and at slight inspection expense. 
In such cases, 100 per cent inspection might be feasible. This would not 
be true wherever inspection was costly or where destructive testing 
was necessary. 
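
The trial-and-error search of 5.5 can also be carried out by summing the Poisson terms instead of reading Figure 1. The sketch below is an illustration under the same Poisson assumption; the function names and the step of 10 in n are arbitrary choices made here. Because g must be a whole number, an exact search need not land on the rounded figures read from the chart.

```python
import math

def poisson_cdf(g, m):
    """Probability of g or fewer defects when m defects are expected."""
    term = math.exp(-m)
    total = term
    for r in range(1, g + 1):
        term *= m / r
        total += term
    return total

def producer_risk(n, g, p):
    """Chance that a lot of the usual quality p is rejected (more than g defects)."""
    return 1.0 - poisson_cdf(g, p * n)

def buyer_risk(n, g, p_tolerance):
    """Chance that a lot of tolerance quality is accepted (g or fewer defects)."""
    return poisson_cdf(g, p_tolerance * n)

def find_plan(p, p_tolerance, max_producer_risk, max_buyer_risk, n_step=10, n_max=2000):
    """Smallest n (in steps of n_step) for which some g satisfies both risks."""
    for n in range(n_step, n_max + 1, n_step):
        g = 0
        while producer_risk(n, g, p) > max_producer_risk:
            g += 1
        if buyer_risk(n, g, p_tolerance) <= max_buyer_risk:
            return n, g
    return None

# Risks actually run with the approximate plan of the text (n = 300, g = 9)
print(round(producer_risk(300, 9, 0.013), 4), round(buyer_risk(300, 9, 0.05), 4))

# Plan meeting both stated risks exactly
print(find_plan(0.013, 0.05, 0.01, 0.05))
```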


NOTES 

5.6 Hypergeometric law. Given a well-mixed lot of size N with $\alpha$ bad
pieces and $\beta$ (= N - $\alpha$) good pieces. A random sample of size n is drawn from
the lot. What is the probability P that the sample contains a bad pieces and
b good pieces?

A sample of size n can be drawn from a lot of size N in $C_n^N$ ways. Further,
a bad pieces can be drawn from $\alpha$ bad pieces in $C_a^\alpha$ ways. Similarly, b good
pieces can be drawn in $C_b^\beta$ ways. Each of the $C_a^\alpha$ sets of a bad pieces can be
paired with each of the $C_b^\beta$ sets of b good pieces, i.e., the total number of ways
in which a bad pieces and b good pieces can be drawn is $C_a^\alpha \cdot C_b^\beta$. Hence the
required probability is

$$P = \frac{C_a^\alpha \cdot C_b^\beta}{C_n^N} = \frac{C_a^\alpha \cdot C_b^\beta}{C_{a+b}^{\alpha+\beta}}$$

which is sometimes known as the hypergeometric law.

Thus, if an urn contains two white and two black balls, the probability that
a random sample of two consists of one white and one black ball is

$$\frac{C_1^2 \cdot C_1^2}{C_2^4} = \frac{4}{6}$$

Fallacious answers to problems of this type can be avoided if one enumerates
the equally likely cases. Thus our lot is

    A  B  C  D
    ○  ○  ●  ●

and the following are equally likely drawings for a sample size of 2

    A B     B C
    ○ ○     ○ ●

    A C     B D
    ○ ●     ○ ●

    A D     C D
    ○ ●     ● ●

four of which satisfy the requirements of the problem, i.e.,

$$P = \frac{4}{6}$$

The hypergeometric law may be looked upon as the law of compound
probabilities for the case in which the several probabilities are affected by
previous drawings. To illustrate, consider the following problem from Fry (17):

A batch of 1000 lamps is 5 per cent bad. If five are tested, what is the
chance that no bad lamps will appear?

By the hypergeometric law

$$P = \frac{C_5^{950}\, C_0^{50}}{C_5^{1000}} = \frac{950!\; 995!}{945!\; 1000!} = 0.7734$$

The probability that the first lamp is good is 950/1000. If a good lamp is
drawn and not replaced, the probability that the second lamp drawn is good is
949/999. Finally, as all five lamps must be good to satisfy the conditions of
the problem,

$$P = \frac{950}{1000} \cdot \frac{949}{999} \cdot \frac{948}{998} \cdot \frac{947}{997} \cdot \frac{946}{996} = \frac{950!\; 995!}{1000!\; 945!} = 0.7734 \text{ as before}$$


5.7 Binomial approximation. If the lot size N is indefinitely larger than
the sample size n, the probability that a lamp is bad will not vary as lamps are
drawn. If p = 0.05 is the probability of drawing a bad lamp, the probability
of drawing 0, 1, ..., n bad lamps in a sample of n is given by the successive
terms of the binomial

$$(q + p)^n$$

Thus

$$P = (0.95)^5 = 0.7738$$

differing but slightly from the previous answer, as would be expected, for N
is 200 times as large as n.

5.8 Poisson distribution.

Given

$$(q + p)^n$$

write

$$P_r = C_r^n q^{n-r} p^r = \frac{n!}{r!\,(n - r)!}\, q^{n-r} p^r$$

$P_r$ being the probability of obtaining exactly r defectives in a random sample
of n pieces drawn from an indefinitely larger lot. If n is large, p small, and
m (= pn, the expected number of defects) is a small finite number, the
expression for $P_r$ can be simplified. First write

$$q^{n-r} = q^n \cdot q^{-r} = q^n \left(1 - \frac{m}{n}\right)^{-r}$$

For r considerably less than n, the last factor will not differ appreciably from
unity, i.e.,

$$q^{n-r} \approx q^n$$

Replacing n! and (n - r)! by the Stirling approximations

$$n! = \sqrt{2\pi n}\; n^n e^{-n}$$

$$(n - r)! = \sqrt{2\pi(n - r)}\; (n - r)^{n-r} e^{-(n-r)}$$

we obtain, since n!/(n - r)! is then very nearly $n^r$,

$$P_r \approx \frac{n^r p^r q^n}{r!} = \frac{m^r}{r!}\left(1 - \frac{m}{n}\right)^n \approx \frac{m^r e^{-m}}{r!}$$

the equation of the Poisson distribution.

In our first example n = 5, p = 1/20, pn = 1/4. The conditions for proper
application of the Poisson approximation are not satisfied, for n is not large.
The result indicates, however, a close approximation to the exact answer as
given by the hypergeometric law, for

$$P_0 = \frac{e^{-1/4}\,(1/4)^0}{0!} = 0.7788$$
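
The three answers to the lamp problem (hypergeometric, binomial, Poisson) can be placed side by side. The sketch below uses only the Python standard library; the function names are illustrative.

```python
from math import comb, exp, factorial

def hypergeometric(a, n, bad, lot):
    """Probability of exactly a bad pieces in a sample of n drawn without
    replacement from a lot containing `bad` bad pieces out of `lot`."""
    return comb(bad, a) * comb(lot - bad, n - a) / comb(lot, n)

def binomial(a, n, p):
    """Probability of exactly a bad pieces when the lot is indefinitely large."""
    return comb(n, a) * p ** a * (1 - p) ** (n - a)

def poisson(a, m):
    """Poisson probability of exactly a bad pieces when m are expected."""
    return exp(-m) * m ** a / factorial(a)

# Lamp problem: 1000 lamps, 5 per cent bad, five tested, none bad
print(round(hypergeometric(0, 5, 50, 1000), 4))  # 0.7734
print(round(binomial(0, 5, 0.05), 4))            # 0.7738
print(round(poisson(0, 5 * 0.05), 4))            # 0.7788

# Urn problem: two white and two black balls, sample of two, one of each
print(hypergeometric(1, 2, 2, 4))
```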


5.9 Note on Figure 1 (p. 176). In a sample of size n, the probabilities
of 0, 1, ..., n defects, as given by the Poisson law, are

$$\frac{e^{-m}}{0!},\quad \frac{e^{-m}m}{1!},\quad \frac{e^{-m}m^2}{2!},\quad \ldots,\quad \frac{e^{-m}m^n}{n!}$$

The probability that a sample of n contains more than g defects (i.e., at least
g + 1 defects) is

$$\sum_{r=g+1}^{n} \frac{e^{-m}m^r}{r!}$$

Figure 1 gives the probability of at least c defects for various values of m,
that is,

$$\sum_{r=c}^{\infty} \frac{e^{-m}m^r}{r!}$$

which is equivalent to

$$1 - \sum_{r=0}^{c-1} \frac{e^{-m}m^r}{r!}$$

5.10 Producer and buyer risk, using the standard deviation. Frequently,
variability in the quality of a product may be even less desirable
than low average quality. Metal strips all of about the same breaking
strength and electric lamps all of about the same life may often be preferred
to batches of these products which are of higher mean quality but
which contain many very good and many very bad strips or lamps.

Crum (7) states that studies involving several hundred concrete beams
used in paving projects in Iowa yield a standard deviation $\sigma'$ of about
10 per cent of the mean quality. The latter is given by a modulus of
rupture of 760 pounds per square inch. Assume that for a certain job,
$\sigma''$ = 20% is the buyer's tolerance variability. Producer and buyer
want small risks, say, 1 in 100 of respectively (1) rejected lots, (2)
less-than-tolerance quality lots. How many pieces should be drawn for test
purposes from each lot and what should be the maximum standard
deviation of the sample if that lot is to be accepted?

The distribution of the variances $s_i^2$ of samples each of size n drawn at
random from a normal population of variance $\sigma^2$ is given by

$$p(s^2)\,d(s^2) = C\,(s^2)^{(n-3)/2}\, e^{-ns^2/2\sigma^2}\, d(s^2)$$

where C is a constant. The distribution of $s_i$ is immediately derivable
and has been discussed in 1.37. It is most convenient, so far as available
tables are concerned, to use the fact that the function

$$\frac{ns^2}{\sigma^2}$$

is distributed as $\chi^2$ with n - 1 degrees of freedom. Values of $\chi^2$ are
shown in Table VII.

In this example, the larger the value of $\sigma$, the poorer the quality.
Hence, compared to the two previous examples, the producer's and the
tolerance populations are reversed in position along the horizontal axis.

$$P \text{ (producer's risk)} = 0.01 \qquad B \text{ (buyer's risk)} = 0.01$$

$$\sigma' = 76 \qquad \sigma'' = 152$$

$$\sigma'^2 = 5776 \qquad \sigma''^2 = 23{,}104$$

We want to determine n and s. We have for the population of $\sigma'$ = 76

$$[1] \qquad \frac{ns^2}{5776} = \chi_P^2 \text{ with } n - 1 \text{ degrees of freedom,}$$

and for the population of tolerance quality $\sigma''$ = 152

$$[2] \qquad \frac{ns^2}{23{,}104} = \chi_B^2 \text{ with } n - 1 \text{ degrees of freedom}$$

In Table VII we have the probabilities of exceeding a value of $\chi^2$ for
various degrees of freedom. We know neither $\chi^2$ nor the number of
degrees of freedom, but we do know that we want the producer's chance
of exceeding $\chi_P^2$ to be 0.01 and the buyer's chance of exceeding $\chi_B^2$ to be
0.99. Also, from [1] and [2]

$$\frac{\chi_P^2}{\chi_B^2} = \frac{23{,}104}{5776} = 4$$

From Table VII, using columns headed by probabilities of 0.99 and 0.01,
we find for the above ratio of 4 about 24 degrees of freedom
($\chi_P^2$ = 42.980, $\chi_B^2$ = 10.856). Hence

$$n = 25$$

Substituting in either [1] or [2] we find

$$s^2 = 9930, \text{ approximately}$$

$$s = 99.7$$

A sample should contain 25 items and have a standard deviation of not
more than 100 pounds per square inch for its lot to be acceptable.
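
The search through Table VII in 5.10 can be imitated numerically. The sketch below is only an illustration: it computes the $\chi^2$ probabilities from the series for the incomplete gamma function rather than from the printed table, finds the tabular points by bisection, and takes the smallest number of degrees of freedom for which the ratio $\chi_P^2/\chi_B^2$ does not exceed $\sigma''^2/\sigma'^2$. All function names are hypothetical.

```python
import math

def chi2_cdf(x, df):
    """P(chi-square <= x) for df degrees of freedom, from the power series
    for the regularized lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (a + k)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_exceeded_with_prob(prob, df):
    """Value of chi-square exceeded with probability `prob` (by bisection)."""
    lo, hi = 0.0, 4.0 * df + 60.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if 1.0 - chi2_cdf(mid, df) > prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sd_sampling_plan(sigma_usual, sigma_tolerance, producer_risk, buyer_risk, max_df=60):
    """Return (n, chi_p, chi_b, s) for the smallest adequate sample size."""
    ratio = (sigma_tolerance / sigma_usual) ** 2
    for df in range(1, max_df + 1):
        chi_p = chi2_exceeded_with_prob(producer_risk, df)
        chi_b = chi2_exceeded_with_prob(1.0 - buyer_risk, df)
        if chi_p / chi_b <= ratio:
            n = df + 1
            s = math.sqrt(chi_p * sigma_usual ** 2 / n)  # from equation [1]
            return n, chi_p, chi_b, s
    raise ValueError("no plan found within max_df degrees of freedom")

n, chi_p, chi_b, s = sd_sampling_plan(76.0, 152.0, 0.01, 0.01)
print(n, round(chi_p, 3), round(chi_b, 3), round(s, 1))
```

Solving [1] for s, as here, holds the producer's risk at exactly the stated figure; solving [2] instead would hold the buyer's risk, and the two answers differ slightly because the ratio at 24 degrees of freedom is not exactly 4.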

5.11 Second example of 5.10. Welch (46) has given examples in
which the size of the sample has, as is sometimes the case, already been
fixed by authority. A manufacturer produces electric light bulbs under
controlled conditions with $\sigma'$ = 0.8. Ten bulbs are to be sampled from
each lot. The producer is willing to incur a 5 per cent risk of having lots
rejected whereas the buyer wants to know what protection such a sample
will, under these conditions, give him against obtaining lots as bad as
$\sigma''$ = 1.5.

We have

$$\frac{ns^2}{\sigma'^2} = 15.62\,s^2 = \chi_P^2$$

For a producer risk of 0.05 and for nine degrees of freedom we have
$\chi_P^2$ = 16.919. Hence

$$s = 1.04$$

Finally

$$\frac{ns^2}{\sigma''^2} = \frac{10(1.04)^2}{(1.5)^2} = \chi_B^2$$

For $\chi_B^2$ = 4.85 and again with nine degrees of freedom, we have P = 0.85,
which is the chance of exceeding $\chi_B^2$. The buyer's risk is therefore 1.00
- 0.85, or 15 per cent, a rather high risk. For better protection to him
either the sample size n must be increased or the standard deviation $\sigma'$
reduced by more efficient plant control.*

* If the producer uses inspection for control, then it is easily shown that for
$\sigma'$ = 0.8 and n = 10, a value of s = 1.04 will likely result in the lot being thrown out;
if so, the buyer's risk is practically zero. This applies in principle to all problems in
this chapter. We presume, however, that the buyer desires protection quite
independent of the producer's intentions.

The following graph illustrates the conditions and conclusions of the
preceding example.

[Figure: General character of distribution of standard deviations $s_i$ of samples of size n = 10]

5.12 Producer and buyer risk, using the coefficient of variation. 
Specification of average quality and variability in quality may be sepa- 
rately provided by the methods already discussed. It is sometimes 
desired to make use of one hybrid statistic which has both features. 
One is the coefficient of variation which is given by 

Standard deviation 
Arithmetic mean 

High values of this statistic will result from high variability in quality 
and low mean quality, both of which we take in our examples to be un- 
favorable. Correspondingly, low values of the coefficient of variation 
are considered favorable. 

Wilsdon (49) gives a frequency distribution shoving the crushing 
strength, in tons, of 188 tests of a brand of brick. As the original data 
are not shown, the following population parameters are estimated from 
his frequency distribution of 188 observations. 

X f = 28.1 
</ - 4.1 

or the coefficient of variation at the works is Vp — 0.1459. 

Assume that for a certain purpose a buyer is willing to accept brick of
lower average strength and higher variability in strength. He is willing
to run a risk of 5 chances in 100 ($B$) of receiving lots of coefficient of
variation $V_B = 0.3$. The producer, whose statistically controlled output
is presumed to be characterized by $V_P = 0.1459$, wishes to run, say, no
greater than a 1 in 100 risk ($P$) of having lots rejected. How many
bricks should be tested, and what is the sample coefficient of variation $v$
which divides acceptable from unacceptable lots?

The function

$$\frac{n v^2}{1 + v^2}\left(\frac{1}{V^2} + 1\right)$$

is distributed approximately as $\chi^2$ with $n - 1$ degrees of freedom. This
approximation is sufficiently accurate for $V$ not greater than 1/3 and $n$ not less than 6.
We have, for producer and buyer interests respectively,

$$\chi_P^2 = \frac{n v^2}{1 + v^2}\left(\frac{1}{V_P^2} + 1\right), \qquad
\chi_B^2 = \frac{n v^2}{1 + v^2}\left(\frac{1}{V_B^2} + 1\right)$$

where $V_P = 0.1459$ is the coefficient of variation associated with the
producer's ordinary output and $V_B = 0.3$ is the coefficient of variation
associated with the buyer's tolerance output. Correspondingly, $\chi_P^2$ is
the value of $\chi^2$ associated with the producer's risk (0.01) and $\chi_B^2$ the
value of $\chi^2$ associated with the buyer's risk (0.05). It remains to find $v$ and $n$.

The following graphical description may be useful.

[Figure not reproduced]

As before, we divide one equality by the other and obtain

$$\frac{\chi_P^2}{\chi_B^2} = \frac{48}{12} = 4$$

Entering the $\chi^2$ tables with probabilities of 0.01 and 0.95, we find the ratio
4 to be associated with approximately 16 degrees of freedom. Hence

$$n = 17$$

Substituting this value of $n$ and the appropriate value of $\chi^2$ into either
of the two original equations, we find

$$v = 0.202$$

A sample of $n = 17$ should be drawn and a sample coefficient of variation of
$v = 0.202$ should divide acceptable and non-acceptable lots.
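
The table lookup in this example can be imitated numerically. The sketch below is an illustration under stated assumptions, not the book's procedure: it computes $\chi^2$ probability points with a hand-written series and bisection (the helper names `chi2_cdf`, `chi2_point` and `cv_sampling_plan` are hypothetical), and it selects the number of degrees of freedom whose ratio of points lies closest to the required ratio, which is one reading of "approximately 16 degrees of freedom". The dividing value $v$ is then taken from the producer's equation.

```python
import math

def chi2_cdf(x, df):
    """P(chi-square with df degrees of freedom <= x), from the power series
    of the regularized lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_point(p_below, df):
    """Chi-square value with probability p_below of falling below it (bisection)."""
    lo, hi = 0.0, df + 20.0 * math.sqrt(2.0 * df) + 20.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p_below:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cv_sampling_plan(V_P, V_B, risk_P, risk_B, max_df=60):
    """Return (n, v): sample size and dividing sample coefficient of variation."""
    target = (1.0 + 1.0 / V_P**2) / (1.0 + 1.0 / V_B**2)
    best = None
    for df in range(1, max_df + 1):
        chi2_P = chi2_point(1.0 - risk_P, df)  # exceeded with probability risk_P
        chi2_B = chi2_point(risk_B, df)        # exceeded with probability 1 - risk_B
        gap = abs(chi2_P / chi2_B - target)
        if best is None or gap < best[0]:
            best = (gap, df, chi2_P)
    _, df, chi2_P = best
    n = df + 1
    w = chi2_P / (n * (1.0 + 1.0 / V_P**2))    # w = v^2 / (1 + v^2)
    return n, math.sqrt(w / (1.0 - w))

n, v = cv_sampling_plan(0.1459, 0.3, risk_P=0.01, risk_B=0.05)
print(n, round(v, 3))
```

Because the degrees of freedom are chosen as the nearest match rather than an exact one, one of the two risks is met only approximately; a stricter rule would take the smallest $n$ at which the ratio of points falls to or below the required ratio.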

5.13 Normal approximation to $\chi^2$. If instead of $V_B = 0.3$, we
had a more stringent buyer's tolerance level, say, $V_B = 0.2$, we would
have found

$$\frac{\chi_P^2}{\chi_B^2} = 1.83 \qquad [3]$$

For this ratio no satisfactory number of degrees of freedom can be found
in the tables of $\chi^2$, i.e., $n - 1$ is greater than 30. In such a case, a normal
distribution solution is possible, for

$$\sqrt{2\chi^2} - \sqrt{2n - 3}$$

is distributed normally with unit variance.

We have

$$\sqrt{2\chi_P^2} - \sqrt{2n - 3} = 2.32, \qquad
\sqrt{2\chi_B^2} - \sqrt{2n - 3} = -1.65 \qquad [4]$$

the values 2.32 and -1.65 (associated with producer and buyer risks of
0.01 and 0.05 respectively) being found from Table IV. From [3] and
[4] we obtain

n = 85, approximately
v = 0.172

As would be expected, if the buyer is to be protected against quality
lower than $V_B = 0.2$ (instead of $V_B = 0.3$) a larger sample and a smaller
sample coefficient of variation will be required.
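
The elimination that leads from [3] and [4] to $n$ and $v$ can be written out in a few lines. The following Python sketch is illustrative only; the function name is hypothetical, and it assumes that $v$ is recovered from the buyer's equation of 5.12 with $V_B = 0.2$.

```python
import math

def normal_approx_plan(ratio, dev_P, dev_B, V_B):
    """Solve [3] and [4] for the sample size n and the dividing value v.

    ratio : chi2_P / chi2_B from [3]
    dev_P : normal deviate for the producer's risk (2.32 in the text)
    dev_B : normal deviate for the buyer's risk (-1.65 in the text)
    """
    root = math.sqrt(ratio)
    # Write a = sqrt(2n - 3). Then sqrt(2 chi2_P) = a + dev_P and
    # sqrt(2 chi2_B) = a + dev_B, so that a + dev_P = root * (a + dev_B).
    a = (dev_P - root * dev_B) / (root - 1.0)
    n = (a * a + 3.0) / 2.0
    chi2_B = (a + dev_B) ** 2 / 2.0
    w = chi2_B / (n * (1.0 + 1.0 / V_B**2))  # w = v^2 / (1 + v^2)
    v = math.sqrt(w / (1.0 - w))
    return n, v

n, v = normal_approx_plan(1.83, 2.32, -1.65, V_B=0.2)
print(round(n), round(v, 3))
```

The unrounded solution for $n$ falls a little short of 85, which is why the text gives the sample size as approximate.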

5.14 Second Example of 5.12. Examples may be given in which $n$
has already been specified by an industrial agreement or by a governmental
authority. Pearson (31, 6) considers a case in which $V_B = 0.200$
and $n = 12$. At what level $V_P$ must the producer control the quality
of his product, and what shall be the value of $v$ which separates acceptable
from non-acceptable lots, in order that the buyer shall run a 1 per cent
chance ($B$) of obtaining lots whose quality is given by a coefficient of
variation $V_B = 0.200$, and the producer a 5 per cent chance ($P$) of having
lots rejected? Since $n(1 + 1/V_B^2) = 12 \times 26 = 312$, we have

$$\chi_B^2 = 312\,\frac{v^2}{1 + v^2}$$

For 11 degrees of freedom and for $B = 0.01$ (which is equivalent to 99
chances in 100 of exceeding $\chi_B^2$) we have from Table VII,

$$\chi_B^2 = 3.053$$

from which

$$v = 0.0995$$

To calculate the necessary level of control $V_P$, we have

$$n = 12, \qquad v = 0.0995, \qquad \chi_P^2 = 19.675$$

from which

$$V_P = 0.0776$$
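
The two steps of this example can be retraced numerically. The Python sketch below is illustrative only; the function name is hypothetical, and the two $\chi^2$ values are simply the table readings quoted in the text for 11 degrees of freedom.

```python
import math

def fixed_n_plan(n, V_B, chi2_B, chi2_P):
    """Return (v, V_P) when the sample size n is prescribed in advance.

    chi2_B and chi2_P are table values for n - 1 degrees of freedom.
    """
    w = chi2_B / (n * (1.0 + 1.0 / V_B**2))  # w = v^2 / (1 + v^2)
    v = math.sqrt(w / (1.0 - w))
    inv_VP_sq = chi2_P / (n * w) - 1.0       # equals 1 / V_P^2
    return v, 1.0 / math.sqrt(inv_VP_sq)

v, V_P = fixed_n_plan(12, 0.200, chi2_B=3.053, chi2_P=19.675)
print(round(v, 4), round(V_P, 4))
```

Carried through without intermediate rounding, the results agree with the printed $v$ and $V_P$ to within about one unit in the fourth decimal place.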


NOTE 

5.15 Distribution of v. McKay (27) is responsible for the proof that

$$\frac{n v^2}{1 + v^2}\left(1 + \frac{1}{V^2}\right)$$

is distributed approximately as $\chi^2$ with $n - 1$ degrees of freedom. The
approximation is best when the coefficient of variation ($V$) of the normal
population is small. Fieller (14) gives the following numerical results which show
that as $n$ becomes larger, the $\chi^2$ approximation improves.


V = 1/3, n = 6

  Chance of sample with smaller v     Chance of sample with larger v
    v    True value   χ² theory         v    True value   χ² theory

  0.15     0.062        0.067         0.48     0.053        0.048
  0.14     0.047        0.051         0.51     0.034        0.030
  0.13     0.034        0.037         0.54     0.022        0.019
  0.12     0.024        0.026         0.57     0.013        0.012
  0.11     0.017        0.018         0.60     0.008        0.007
  0.10     0.011        0.012         0.63     0.005        0.004
  0.09     0.007        0.007         0.66     0.003        0.003
  0.08     0.004        0.004         0.69     0.002        0.002
  0.07     0.002        0.002         0.72     0.001        0.001
  0.06     0.001        0.001         0.75     0.001        0.001


V = 1/3, n = 18

  Chance of sample with smaller v     Chance of sample with larger v
    v    True value   χ² theory         v    True value   χ² theory

  0.24     0.084        0.088         0.42     0.060        0.058
  0.23     0.058        0.061         0.43     0.046        0.044
  0.22     0.038        0.040         0.44     0.035-       0.033
  0.21     0.024        0.026         0.45     0.026        0.024
  0.20     0.014        0.015+        0.46     0.019        0.018
  0.19     0.008        0.009         0.47     0.014        0.013
  0.18     0.004        0.005-        0.48     0.010        0.009
  0.17     0.002        0.002         0.49     0.007        0.007
  0.16     0.001        0.001         0.50     0.005-       0.005-
  ....                                0.51     0.003        0.003
  ....                                0.52     0.002        0.002
  ....                                0.53     0.002        0.002
  ....                                0.54     0.001        0.001
  ....                                0.55     0.001        0.001
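
An entry of the "χ² theory" columns can be recomputed from McKay's approximation. The Python sketch below is illustrative only: the helper names are hypothetical, and the $\chi^2$ probability is obtained from a hand-written series rather than from the book's tables.

```python
import math

def chi2_cdf(x, df):
    """P(chi-square with df degrees of freedom <= x), from the power series
    of the regularized lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def mckay_lower_tail(v, V, n):
    """Approximate chance of a sample coefficient of variation smaller than v."""
    x = n * (1.0 + 1.0 / V**2) * v * v / (1.0 + v * v)
    return chi2_cdf(x, n - 1)

# First row of the table for V = 1/3, n = 6.
p = mckay_lower_tail(0.15, 1.0 / 3.0, 6)
print(round(p, 3))
```

The value obtained for this row agrees with the tabulated "χ² theory" figure to three decimal places.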
TABLE I

Probability Points of √b₁

  Size of    Probability points     Size of    Probability points     Size of    Probability points
  sample n      5%       1%         sample n      5%       1%         sample n      5%       1%

      25      0.711    1.061           200      0.280    0.403          1000      0.127    0.180
      30      0.661    0.982           250      0.251    0.360          1200      0.116    ....
      35      0.621    0.921           300      0.230    0.329          1400      0.107    ....
      40      0.587    0.869           350      0.213    0.305          1600      0.100    0.142
      45      0.558    0.825           400      0.200    0.285          1800      0.095    0.134
      50      0.533    0.787           450      0.188    0.269          2000      0.090    0.127

                                       500      0.179    0.255
      60      0.492    0.723           550      0.171    0.243          2500      0.080    0.114
      70      0.459    0.673           600      0.163    ....           3000      0.073    ....
      80      0.432    0.631           650      0.157    0.224          3500      0.068    ....
      90      0.409    0.596           700      0.151    ....           4000      0.064    ....
     100      0.389    0.567           750      0.146    0.208          4500      0.060    ....

                                       800      0.142    0.202          5000      0.057    0.081
     125      0.350    0.508           850      0.138    0.196
     150      0.321    0.464           900      0.134    0.190
     175      0.298    0.430           950      0.130    0.185
     200      0.280    0.403          1000      0.127    0.180

Note. As the sampling distribution of √b₁ is symmetrical about zero, the same values, with
negative sign, correspond to the lower limits.


TABLE II

Probability Points of b₂

  Size of          Probability points             Size of          Probability points
  sample n   Upper   Upper   Lower   Lower        sample n   Upper   Upper   Lower   Lower
              1%      5%      5%      1%                      1%      5%      5%      1%

     200     3.98    3.57    ....    2.37           1000     3.41    3.26    2.76    2.68
     250     3.87    3.52    ....    2.42           1200     3.37    3.24    2.78    2.71
     300     3.79    3.47    ....    2.46           1400     3.34    3.22    2.80    2.72
     350     3.72    3.44    2.62    2.50           1600     3.32    3.21    2.81    2.74
     400     3.67    3.41    2.64    2.52           1800     3.30    ....    2.82    2.76
     450     3.63    3.39    2.66    2.55           2000     3.28    3.18    2.83    2.77
     500     3.60    3.37    2.67    2.57

     550     3.57    3.35    2.69    2.58           2500     3.25    3.16    2.85    2.79
     600     3.54    3.34    2.70    2.60           3000     3.22    3.15    2.86    2.81
     650     3.52    3.33    2.71    2.61           3500     3.21    3.14    2.87    2.82
     700     3.50    3.31    2.72    2.62           4000     3.19    3.13    2.88    2.83
     750     3.48    3.30    2.73    2.64           4500     3.18    3.12    2.88    2.84
     800     3.46    3.29    2.74    2.65           5000     3.17    3.12    2.89    2.85
     850     3.45    3.28    2.74    2.66

     900     3.43    3.28    2.75    2.66
     950     3.42    3.27    2.76    2.67
    1000     3.41    3.26    2.76    2.68
TABLE III

Probability Points of a

  Size of                       Probability points
  sample n   Upper    Upper    Upper    Lower    Lower    Lower      Mean
              1%       5%       10%      10%      5%       1%

      11    0.9359   0.9073   0.8899   0.7409   0.7153   0.6675    0.81805
      16    0.9137   0.8884   0.8733   0.7452   0.7236   0.6829    0.81128
      21    0.9001   ....     0.8631   0.7495   0.7304   0.6950    0.80792
      26    0.8901   0.8686   0.8570   ....     0.7360   0.7040    0.80590
      31    0.8827   0.8625   0.8511   0.7559   0.7404   0.7110    0.80456
      36    0.8769   0.8578   0.8468   0.7583   0.7440   0.7167    0.80360
      41    0.8722   0.8540   0.8436   0.7604   0.7470   0.7216    0.80289
      46    0.8682   0.8508   ....     0.7621   0.7496   0.7256    0.80233
      51    0.8648   0.8481   ....     0.7636   0.7518   0.7291    0.80188
      61    0.8592   0.8434   0.8349   0.7662   0.7554   0.7347    0.80122
      71    0.8549   0.8403   0.8321   0.7683   0.7583   0.7393    0.80074
      81    0.8515   0.8376   0.8298   0.7700   0.7607   0.7430    0.80038
      91    0.8484   0.8353   0.8279   0.7714   0.7626   0.7460    0.80010
     101    0.8460   ....     0.8264   0.7726   0.7644   0.7487    0.79988
     201    0.8322   0.8229   0.8178   0.7796   0.7738   0.7629    0.79888
     301    0.8260   0.8183   0.8140   0.7828   0.7781   0.7693    0.79855
     401    ....     0.8155   0.8118   0.7847   0.7807   0.7731    0.79838
     501    0.8198   0.8136   ....     0.7861   0.7825   0.7757    0.79828
     601    0.8179   0.8123   0.8092   0.7873   0.7838   0.7776    0.79822
     701    0.8164   0.8112   0.8084   0.7878   0.7848   0.7791    0.79817
     801    0.8152   0.8103   0.8077   0.7888   0.7857   0.7803    0.79813
     901    0.8142   0.8096   0.8071   ....     0.7864   0.7814    0.79811
    1001    0.8134   0.8090   0.8066   0.7894   0.7869   0.7822    0.79808











TABLE IV

Normal Distribution Areas

  x/σ     0.00     0.01     0.02     0.03     0.04     0.05     0.06     0.07     0.08     0.09

  0.0    0.0000   0.0040   0.0080   0.0120   0.0159   0.0199   0.0239   0.0279   0.0319   0.0359
  0.1    0.0398   0.0438   0.0478   0.0517   0.0557   0.0596   0.0636   0.0675   0.0714   0.0753
  0.2    0.0793   0.0832   0.0871   0.0910   0.0948   0.0987   0.1026   0.1064   0.1103   0.1141
  0.3    0.1179   0.1217   0.1255   0.1293   0.1331   0.1368   0.1406   0.1443   0.1480   0.1517
  0.4    0.1554   0.1591   0.1628   0.1664   0.1700   0.1736   0.1772   0.1808   0.1844   0.1879
  0.5    0.1915   0.1950   0.1985   0.2019   0.2054   0.2088   0.2123   0.2157   0.2190   0.2224
  0.6    0.2257   0.2291   0.2324   0.2357   0.2389   0.2422   0.2454   0.2486   0.2518   0.2549
  0.7    0.2580   0.2612   0.2642   0.2673   0.2704   0.2734   0.2764   0.2794   0.2823   0.2852
  0.8    0.2881   0.2910   0.2939   0.2967   0.2995   0.3023   0.3051   0.3078   0.3106   0.3133
  0.9    0.3159   0.3186   0.3212   0.3238   0.3264   0.3289   0.3315   0.3340   0.3365   0.3389
  1.0    0.3413   0.3438   0.3461   0.3485   0.3508   0.3531   0.3554   0.3577   0.3599   0.3621
  1.1    0.3643   0.3665   0.3686   0.3718   0.3729   0.3749   0.3770   0.3790   0.3810   0.3830
  1.2    0.3849   0.3869   0.3888   0.3907   0.3925   0.3944   0.3962   0.3980   0.3997   0.4015
  1.3    0.4032   0.4049   0.4066   0.4083   0.4099   0.4115   0.4131   0.4147   0.4162   0.4177
  1.4    0.4192   0.4207   0.4222   0.4236   0.4251   0.4265   0.4279   0.4292   0.4306   0.4319
  1.5    0.4332   0.4345   0.4357   0.4370   0.4382   0.4394   0.4406   0.4418   0.4430   0.4441
  1.6    0.4452   0.4463   0.4474   0.4485   0.4495   0.4505   0.4515   0.4525   0.4535   0.4545
  1.7    0.4554   0.4564   0.4573   0.4582   0.4591   0.4599   0.4608   0.4616   0.4625   0.4633
  1.8    0.4641   0.4649   0.4656   0.4664   0.4671   0.4678   0.4686   0.4693   0.4699   0.4706
  1.9    0.4713   0.4719   0.4726   0.4732   0.4738   0.4744   0.4750   0.4758   0.4762   0.4767
  2.0    0.4773   0.4778   0.4783   0.4788   0.4793   0.4798   0.4803   0.4808   0.4812   0.4817
  2.1    0.4821   0.4826   0.4830   0.4834   0.4838   0.4842   0.4846   0.4850   0.4854   0.4857
  2.2    0.4861   0.4865   0.4868   0.4871   0.4875   0.4878   0.4881   0.4884   0.4887   0.4890
  2.3    0.4893   0.4896   0.4898   0.4901   0.4904   0.4906   0.4909   0.4911   0.4913   0.4916
  2.4    0.4918   0.4920   0.4922   0.4925   0.4927   0.4929   0.4931   0.4932   0.4934   0.4936
  2.5    0.4938   0.4940   0.4941   0.4943   0.4945   0.4946   0.4948   0.4949   0.4951   0.4952
  2.6    ....     0.4955   0.4956   0.4957   0.4959   0.4960   0.4961   0.4962   0.4963   0.4964
  2.7    0.4965   0.4966   0.4967   0.4968   0.4969   0.4970   0.4971   0.4972   0.4973   0.4974
  2.8    0.4974   0.4975   0.4976   0.4977   0.4977   0.4978   0.4979   0.4980   0.4980   0.4981
  2.9    0.4981   0.4982   0.4983   0.4984   0.4984   0.4984   0.4985   0.4985   0.4986   0.4986
  3.0    0.49865  0.4987   0.4987   0.4988   0.4988   0.4988   0.4989   0.4989   0.4989   0.4990
  3.1    0.49903  0.4991   0.4991   0.4991   0.4992   0.4992   0.4992   0.4992   0.4993   0.4993

  3.2    0.4993129
  3.3    0.4995166
  3.4    0.4996631
  3.5    0.4997674
  3.6    0.4998409
  3.7    0.4998922
  3.8    0.4999277
  3.9    0.4999519
  4.0    0.4999683
  4.5    0.4999966
  5.0    0.4999997133
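
Entries of Table IV can be regenerated from the error function. The Python sketch below is illustrative only; it assumes that each tabulated area is measured from the mean to the stated deviate, and the function name is hypothetical.

```python
import math

def normal_area(z):
    """Area under the standard normal curve between 0 and z."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

# Deviates used in the text: 1.65 (buyer's risk 0.05) and 2.32 (producer's risk 0.01).
for z in (1.0, 1.65, 2.32):
    print(z, round(normal_area(z), 4))
```

An area of about 0.45 between the mean and 1.65 leaves about 0.05 in the tail, and an area of about 0.49 between the mean and 2.32 leaves about 0.01, which is how the two deviates correspond to the stated risks.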
Reproduced from Statistical Methods for Research Workers, 6th ed., with the permission of the author, R. A. Fisher, and his publisher, Oliver and Boyd, Edinburgh.


TABLE VI

Ratio of the Mean Range to Standard Deviation

The ratio of mean range of samples of size n to σ of the normal population from which
they are drawn

    n    Mean range/σ      n    Mean range/σ      n    Mean range/σ      n    Mean range/σ

    0                     10      3.07751        20      3.73495        30      4.08552
    1                     11      3.17287        21      3.77834        31      4.11293
    2      1.12838        12      3.25846        22      3.81938        32      4.13934
    3      1.69257        13      3.33598        23      3.85832        33      4.16482
    4      2.05875        14      3.40676        24      3.89535        34      4.18943
    5      2.32593        15      3.47183        25      3.93063        35      4.21322
    6      2.53441        16      3.53198        26      3.96432        36      4.23625
    7      2.70436        17      3.58788        27      3.99654        37      4.25855
    8      2.84720        18      3.64006        28      4.02741        38      4.28018
    9      2.97003        19      3.68896        29      4.05704        39      4.30117

    n    Mean range/σ      n    Mean range/σ      n    Mean range/σ      n    Mean range/σ

   40      4.32156        85      4.89789       150      5.29849       400      5.93636
   45      4.41544        90      4.93940       160      5.34244       450      6.00903
   50      4.49815        95      4.97841       170      5.38344       500      6.07340
   55      4.57197       100      5.01519       180      5.42186       600      6.18340
   60      4.63856       105      5.04997       190      5.45799       700      6.27510
   65      4.69916       110      5.08295       200      5.49209       800      6.35358
   70      4.75472       120      5.14417       250      5.63837       900      6.42211
   75      4.80598       130      5.19996       300      5.75553      1000      6.48287
   80      4.85355       140      5.25118       350      5.85302













For degrees of freedom greater than 30, the expression $\sqrt{2\chi^2} - \sqrt{2n' - 1}$ may be used as a normal deviate with unit variance, where $n'$ is the number of degrees of freedom.
Reproduced from Statistical Methods for Research Workers, 6th ed., with the permission of the author, R. A. Fisher, and his publisher, Oliver and Boyd, Edinburgh.

Figure 1. Probability curves showing Poisson's exponential summation



